BACKGROUND OF THE INVENTION

The genus Moraxella is a member of the family Neisseriaceae. The 10 species of this 
genus are separated into 2 subgenera, Moraxella (rods) and Branhamella (cocci). Moraxella 
are gram-negative, aerobic, oxidase-positive, and usually catalase-positive. (Bovre, K. 1984.
Genus II. Moraxella Lwoff 1939, 173 emend. Henriksen and Bovre 1968, 391, 105. Krieg
and Holt (editors) In Bergey's Manual of Systematic Bacteriology, 1:296-303.). Moraxella
catarrhalis, a member of the subgenus Branhamella, was previously called Branhamella
catarrhalis and Neisseria catarrhalis. 

Moraxella catarrhalis is frequently isolated from the nasal cavity of humans and,
until recently, was considered a nonpathogenic commensal of the upper respiratory tract.
Currently it is the most important lower respiratory pathogen after S. pneumoniae and H.
influenzae (Doern, G., et al, 1986. Diagn. Microbiol. Infect. Dis. 4:191-201.). It is a
common cause of otitis media in children, acute bronchitis or pneumonia in adults, and
sinusitis (Wood, G., et al, 1996. Clin. Infect. Dis. 22:632-636.). Bacteremia, meningitis,
skeletal infections and endocarditis due to M. catarrhalis are rare, but are observed in
immunocompromised individuals (Aebi, C., et al, 1998. Infect. Immun. 66:540-548.).
Concern for M. catarrhalis infections of cystic fibrosis (CF) patients is growing. Damage to
the respiratory tract by M. catarrhalis could promote invasion by other pathogens such as P.
aeruginosa in CF patients (Deneuville, E., et al, 1995. Acta Paediatr. 84:1212.). M.
catarrhalis is also associated with acute laryngitis. In one study, 50% of patients with acute
laryngitis were colonized with M. catarrhalis (Hoi, C, et al, 1996. Journal of Infectious
Diseases. 174:636-638.), while M. catarrhalis is isolated from healthy adults at a rate of
6%-11%. The colonization rates of children can be much higher, with average rates of 30%-35%
(Sehgal, SC. et al, 1994. Infection 22:193-196.). In some hospitals, M. catarrhalis accounts
for half of all the respiratory infections (Bluestone, C., et al, 1992. Pediatr. Infect. Dis. J.
11:S7-S11.).

Increasing levels of antibiotic resistance have recently been observed in clinical isolates
of M. catarrhalis. Before 1980, less than 10% of M. catarrhalis isolates were
β-lactamase-positive. Currently, most clinical isolates produce β-lactamase, making them
resistant to β-lactam antibiotics such as penicillin (Doern, G., et al, 1996. Antimicrob.
Agents Chemother. 40:2884-2886.). M. catarrhalis is intrinsically resistant to a small group
of drugs that includes vancomycin and trimethoprim (Wallace, RJ. 1990. Am. J. Med.
88:46S-50S), and is becoming increasingly resistant to sulfamethoxazole, oral
cephalosporins, and macrolides (Hoppe, HL. 1998. Am. J. Health Syst. Pharm. 55:1881-97).

Although M. catarrhalis was once considered only part of the nonpathogenic flora
of the upper respiratory tract, it is emerging as an important respiratory pathogen. Currently,
it is the third leading cause of lower respiratory tract infections and otitis media. Sequencing 
and further analysis of this genome will aid in identification of essential genes for 
development of drug targets, and reduce the health threat this organism poses. 



SUMMARY OF THE INVENTION 

The present invention fulfills the need for diagnostic tools and therapeutics by 
providing bacterial-specific compositions and methods for detecting Moraxella species 
including M. catarrhalis, as well as compositions and methods useful for treating and 
preventing Moraxella infection, in particular, M. catarrhalis infection, in vertebrates
including mammals. 

The present invention encompasses isolated nucleic acids and polypeptides derived 
from M. catarrhalis that are useful as reagents for diagnosis of bacterial disease, components
of effective antibacterial vaccines, and/or as targets for antibacterial drugs including anti-M.
catarrhalis drugs. They can also be used to detect the presence of M. catarrhalis and other
Moraxella species in a sample; and in screening compounds for the ability to interfere with 
the M. catarrhalis life cycle or to inhibit M. catarrhalis infection. They also have use as 
biocontrol agents for plants. 

In one aspect, the invention features compositions of nucleic acids corresponding to
entire coding sequences of M. catarrhalis proteins (SEQ ID NO: 1 - SEQ ID NO: 1920),
including surface or secreted proteins or parts thereof, nucleic acids capable of binding
mRNA encoding M. catarrhalis proteins to block protein translation, and methods for producing
M. catarrhalis proteins or parts thereof using peptide synthesis and recombinant DNA
techniques. This invention also features antibodies and nucleic acids useful as probes to
detect M. catarrhalis infection. In addition, vaccine compositions and methods for the
protection or treatment of infection by M. catarrhalis are within the scope of this invention. 

The nucleotide sequences provided in SEQ ID NO: 1 - SEQ ID NO: 1920, a fragment 
thereof, or a nucleotide sequence at least about 99.5% identical to a sequence contained 
within SEQ ID NO: 1 - SEQ ID NO: 1920 may be "provided" in a variety of media to
facilitate use thereof. As used herein, "provided" refers to a manufacture, other than an
isolated nucleic acid molecule, which contains a nucleotide sequence of the present
invention, i.e., the nucleotide sequence provided in SEQ ID NO: 1 - SEQ ID NO: 1920, a
fragment thereof, or a nucleotide sequence at least about 99.5% identical to a sequence
contained within SEQ ID NO: 1 - SEQ ID NO: 1920. Uses for and methods for providing
nucleotide sequences in a variety of media are well known in the art (see e.g., EPO
Publication No. EP 0 756 006). 

In one application of this embodiment, a nucleotide sequence of the present invention 
can be recorded on computer readable media. As used herein, "computer readable media"
refers to any media which can be read and accessed directly by a computer. Such media
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc 
storage media, and magnetic tape; optical storage media such as CD-ROM; electrical storage 
media such as RAM and ROM; and hybrids of these categories such as magnetic/optical 
storage media. A person skilled in the art can readily appreciate how any of the presently
known computer readable media can be used to create a manufacture comprising computer
readable media having recorded thereon a nucleotide sequence of the present invention. 

As used herein, "recorded" refers to a process for storing information on computer 
readable media. A person skilled in the art can readily adopt any of the presently known 
methods for recording information on computer readable media to generate manufactures
comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a person skilled in the art for
creating computer readable media having recorded thereon a nucleotide sequence of the
present invention. The choice of the data storage structure will generally be based on the
means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the
present invention on computer readable media. The sequence information can be
represented in a word processing text file, formatted in commercially-available software
such as WordPerfect and Microsoft Word, represented in the form of an ASCII file, or stored
in a database application, such as DB2, Sybase, Oracle, or the like. A person skilled in the
art can readily adapt any number of data processor structuring formats (e.g. text file or
database) in order to obtain computer readable media having recorded thereon the nucleotide
sequence information of the present invention. 
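
By way of illustration only, the following sketch shows one way sequence information could be recorded as a plain ASCII text file and read back. The FASTA-style layout (a ">" header line followed by wrapped sequence lines), the function names, and the example identifiers are assumptions chosen for the sketch; the specification does not prescribe any particular file layout.

```python
from typing import Dict, Iterable, Tuple


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Serialize (identifier, sequence) pairs as plain ASCII FASTA-style text."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        # Wrap the sequence so each text line holds at most `width` residues.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


def from_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-style text back into a mapping of identifier -> sequence."""
    records: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            current = line[1:]
            records[current] = ""
        elif current is not None:
            # Sequence lines are concatenated and normalized to upper case.
            records[current] += line.upper()
    return records


if __name__ == "__main__":
    # Hypothetical identifiers and sequences, for demonstration only.
    example = [("example_seq_1", "ATGC" * 20), ("example_seq_2", "ttgaca")]
    text = to_fasta(example, width=30)
    print(text)
    print(from_fasta(text))
```

A flat text file of this kind suits sequential access; a database application, as mentioned above, would be the more natural choice where records are retrieved by identifier.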

By providing the nucleotide sequence of SEQ ID NO: 1 - SEQ ID NO: 1920, a
fragment thereof, or a nucleotide sequence at least about 99.5% identical to SEQ ID NO: 1 -
SEQ ID NO: 1920 in computer readable form, a person skilled in the art can routinely access
the coding sequence information for a variety of purposes. Computer software is publicly
available which allows a person skilled in the art to access sequence information provided in
computer readable media. Examples of such computer software include programs of the
"Staden Package", "DNA Star", "MacVector", GCG "Wisconsin Package" (Genetics
Computer Group, Madison, WI) and "NCBI Toolbox" (National Center For Biotechnology
Information). Suitable programs are described, for example, in Martin J. Bishop, ed., Guide
to Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998); and
Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997).

Computer algorithms enable the identification of M. catarrhalis open reading frames
(ORFs) within SEQ ID NO: 1 - SEQ ID NO: 1920 which contain homology to ORFs or
proteins from other organisms. Examples of such similarity-search algorithms include the
BLAST [Altschul et al., J. Mol. Biol. 215:403-410 (1990)] and Smith-Waterman [Smith and
Waterman (1981) Advances in Applied Mathematics, 2:482-489] search algorithms.
Suitable search algorithms are described, for example, in Martin J. Bishop, ed., Guide to
Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998); and
Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997). Such algorithms are utilized on computer systems as exemplified
below. The ORFs so identified represent protein encoding fragments within the M.
catarrhalis genome and M. catarrhalis plasmids and are useful in producing commercially
important proteins such as enzymes used in fermentation reactions and in the production of
commercially useful metabolites.
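
As a non-limiting illustration of the Smith-Waterman approach named above, the sketch below computes the best local alignment score between two sequences by dynamic programming. The scoring values (match, mismatch, and a linear gap penalty) and the function name are assumptions made for the example; production tools typically use substitution matrices and affine gap penalties, and this sketch does not reproduce any particular package.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses a simple match/mismatch score and a linear gap penalty
    (illustrative assumptions, not values taken from the specification).
    """
    previous = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            current[j] = max(
                0,
                previous[j - 1] + pair,  # align a[i-1] with b[j-1]
                previous[j] + gap,       # gap in b
                current[j - 1] + gap,    # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGTACGT", "ACGTTACGT"))
```

Keeping only two rows of the scoring matrix is sufficient when only the score is needed; recovering the alignment itself would require retaining the full matrix for traceback.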

The present invention further provides systems, particularly computer-based systems,
which contain the sequence information described herein. Such systems are designed to
identify commercially important fragments of the M. catarrhalis genome and plasmids. As
used herein, "a computer-based system" refers to the hardware means, software means, and
data storage means used to analyze the nucleotide sequence information of the present
invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A person skilled in the art can readily appreciate that any one of the currently
available computer-based systems is suitable for use in the present invention. The
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target structural
motif with the sequence information stored within the data storage means. Search means are
used to identify fragments or regions of the M. catarrhalis genome and plasmids which are
similar to, or "match", a particular target sequence or target motif. A variety of
algorithms are known in the art and have been disclosed publicly, and a variety of
commercially available software for conducting homology-based similarity searches is
available and can be used in the computer-based systems of the present invention. Examples
of such software include, but are not limited to, FASTA (GCG Wisconsin Package), bic_SW
(Compugen Bioccelerator), BLASTN2, BLASTP2, BLASTX2 (NCBI) and Motifs (GCG).
Suitable software programs are described, for example, in Martin J. Bishop, ed., Guide to
Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998); and
Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997). A person skilled in the art can readily recognize that any one of
the available algorithms or implementing software packages for conducting homology
searches can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or
more nucleotides or two or more amino acids. A person skilled in the art can readily
recognize that the longer a target sequence is, the less likely a target sequence will be present
as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues.
However, it is well recognized that many genes are longer than 500 amino acids, or 1.5 kb in
length, and that commercially important fragments of the M. catarrhalis genome and
plasmids from M. catarrhalis, such as sequence fragments involved in gene expression and
protein processing, will often be shorter than 30 nucleotides.
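
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below estimates the expected number of purely random occurrences of a target of a given length in a database, under the simplifying assumption (not stated in the specification) that residues are independent and equally frequent; the function name and the database length used in the example are hypothetical.

```python
def expected_random_matches(target_length: int, database_length: int,
                            alphabet_size: int = 4) -> float:
    """Expected count of chance exact matches of a target in a random database.

    Assumes independent, equally frequent residues (an illustrative
    simplification). alphabet_size is 4 for nucleotides, 20 for amino acids.
    """
    if target_length < 1:
        raise ValueError("target_length must be at least 1")
    if database_length < target_length:
        return 0.0
    # Number of start positions at which the target could occur.
    positions = database_length - target_length + 1
    # Probability that one given position matches the target exactly.
    per_position = (1.0 / alphabet_size) ** target_length
    return positions * per_position


if __name__ == "__main__":
    hypothetical_database_length = 2_000_000  # illustrative value only
    for length in (6, 12, 30):
        print(length, expected_random_matches(length, hypothetical_database_length))
```

Under these assumptions each additional nucleotide reduces the expected number of chance matches roughly fourfold, which is why a six-nucleotide target is expected to recur by chance many times in a large database while a thirty-nucleotide target is not.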

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on 
a specific functional domain or three-dimensional configuration which is formed upon the 
folding of the target polypeptide. There are a variety of target motifs known in the art.
Protein target motifs include, but are not limited to, enzymatic active sites,
membrane-spanning regions, and signal sequences. Nucleic acid target motifs include, but are not
limited to, promoter sequences, hairpin structures and inducible expression elements (protein 
binding sequences). 

A variety of structural formats for the input and output means can be used to input
and output the information in the computer-based systems of the present invention. A
preferred format for an output means ranks fragments of the M. catarrhalis genome and
plasmids possessing varying degrees of homology to the target sequence or target motif.
Such presentation provides a person skilled in the art with a ranking of sequences which
contain various amounts of the target sequence or target motif and identifies the degree of
homology contained in the identified fragment.
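
A ranked output of the kind described can be sketched as follows. The example scores each stored fragment by its best ungapped percent identity to the target over any window of the target's length, then sorts the fragments from most to least similar. The scoring measure, the function names, and the fragment names are assumptions made for illustration; an actual search means would normally use an alignment program such as those listed above.

```python
from typing import Dict, List, Tuple


def best_window_identity(target: str, fragment: str) -> float:
    """Highest ungapped percent identity of target against any window of fragment."""
    n = len(target)
    if n == 0 or len(fragment) < n:
        return 0.0
    best = 0
    for start in range(len(fragment) - n + 1):
        window = fragment[start:start + n]
        # Count positions at which target and window carry the same residue.
        matches = sum(1 for x, y in zip(target, window) if x == y)
        best = max(best, matches)
    return 100.0 * best / n


def rank_fragments(target: str, fragments: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (fragment name, percent identity) pairs, most similar first."""
    scored = [(name, best_window_identity(target, seq))
              for name, seq in fragments.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical fragments, for demonstration only.
    stored = {
        "fragment_a": "GGGACGTACGTGGG",
        "fragment_b": "GGGACGTTCGTGGG",
        "fragment_c": "CCCCCCCCCCCC",
    }
    for name, identity in rank_fragments("ACGTACGT", stored):
        print(f"{name}\t{identity:.1f}%")
```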

A variety of comparing means can be used to compare a target sequence or target
motif with the data storage means to identify sequence fragments of the M. catarrhalis
genome and plasmids. In the present examples, software implementing the
BLASTP2 and bic_SW algorithms (Altschul et al., J. Mol. Biol. 215:403-410 (1990);
Compugen Bioccelerator) was used to identify open reading frames within the M. catarrhalis
genome and plasmids. A person skilled in the art can readily recognize that any one of the
publicly available homology search programs can be used as the search means for the
computer-based systems of the present invention. Suitable programs are described, for
example, in Martin J. Bishop, ed., Guide to Human Genome Computing, 2d Edition,
Academic Press, San Diego, CA. (1998); and Leonard F. Peruski, Jr., and Anne Harwood
Peruski, The Internet and the New Biology: Tools for Genomic and Molecular Research,
American Society for Microbiology, Washington, D.C. (1997).

The invention features M. catarrhalis polypeptides, preferably a substantially pure
preparation of an M. catarrhalis polypeptide, or a recombinant M. catarrhalis polypeptide.
In preferred embodiments: the polypeptide has biological activity; the polypeptide has an
amino acid sequence at least about 60%, 70%, 80%, 90%, 95%, 98%, or 99% identical to an
amino acid sequence of the invention contained in the Sequence Listing, preferably it has
about 65% sequence identity with an amino acid sequence of the invention contained in the
Sequence Listing, and most preferably it has about 92% to about 99% sequence identity with
an amino acid sequence of the invention contained in the Sequence Listing; the polypeptide
has an amino acid sequence essentially the same as an amino acid sequence of the invention
contained in the Sequence Listing; the polypeptide is at least about 5, 10, 20, 50, 100, or 150
amino acid residues in length; the polypeptide includes at least about 5, preferably at least
about 10, more preferably at least about 20, still more preferably at least about 50, 100, or
150 contiguous amino acid residues of the invention contained in the Sequence Listing. In
yet another preferred embodiment, an amino acid sequence which differs in sequence
identity by about 7% to about 8% from the M. catarrhalis amino acid sequences of the
invention contained in the Sequence Listing is also encompassed by the invention.

In preferred embodiments: the M. catarrhalis polypeptide is encoded by a nucleic
acid of the invention contained in the Sequence Listing, or by a nucleic acid having at least
about 60%, 70%, 80%, 90%, 95%, 98%, or 99% sequence identity or % homology with a
nucleic acid of the invention contained in the Sequence Listing.

Applicant's Docket No.: PATH03-14

In a preferred embodiment, the subject M. catarrhalis polypeptide differs in amino 
acid sequence at about 1, 2, 3, 5, 10 or more residues from a sequence of the invention 
contained in the Sequence Listing. The differences, however, are such that the M.
catarrhalis polypeptide exhibits an M. catarrhalis biological activity, e.g., the M. catarrhalis
polypeptide retains a biological activity of a naturally occurring M. catarrhalis enzyme. 

In preferred embodiments, the polypeptide includes all or a fragment of an amino 
acid sequence of the invention contained in the Sequence Listing; fused, in reading frame, to 
additional amino acid residues, preferably to residues encoded by genomic DNA 5' or 3' to
the genomic DNA which encodes a sequence of the invention contained in the Sequence 
Listing. 

In yet other preferred embodiments, the M. catarrhalis polypeptide is a recombinant
fusion protein having a first M. catarrhalis polypeptide portion and a second polypeptide
portion, e.g., a second polypeptide portion having an amino acid sequence unrelated to M.
catarrhalis. The second polypeptide portion can be, e.g., any of glutathione-S-transferase, a
DNA binding domain, or a polymerase activating domain. In a preferred embodiment, the
fusion protein can be used in a two-hybrid assay.

Polypeptides of the invention include those which arise as a result of alternative
transcription events, alternative RNA splicing events, and alternative translational and
posttranslational events.

In a preferred embodiment, the encoded M. catarrhalis polypeptide differs (e.g., by
amino acid substitution, addition or deletion of at least one amino acid residue) in amino
acid sequence at about 1, 2, 3, 5, 10 or more residues, from a sequence of the invention
contained in the Sequence Listing. The differences, however, are such that the M.
catarrhalis encoded polypeptide exhibits an M. catarrhalis biological activity, e.g., the
encoded M. catarrhalis enzyme retains a biological activity of a naturally occurring M.
catarrhalis enzyme.

In preferred embodiments, the encoded polypeptide includes all or a fragment of an 
amino acid sequence of the invention contained in the Sequence Listing; fused, in reading
frame, to additional amino acid residues, preferably to residues encoded by genomic DNA 5'
or 3' to the genomic DNA which encodes a sequence of the invention contained in the 
Sequence Listing. 

The M. catarrhalis strain, 98-4362, from which the genomic sequences were
determined, was deposited on July 20, 1998, in the American Type Culture Collection
and assigned the ATCC designation # 202156.

Included in the invention are: allelic variations; natural mutants; induced mutants;
proteins encoded by DNA that hybridizes under high or low stringency conditions to a nucleic
acid which encodes a polypeptide of the invention contained in the Sequence Listing (for
definitions of high and low stringency see Current Protocols in Molecular Biology, John
Wiley & Sons, New York, 1989, 6.3.1 - 6.3.6, hereby incorporated by reference); and,
polypeptides specifically bound by antisera to M. catarrhalis polypeptides, especially by
antisera to an active site or binding domain of an M. catarrhalis polypeptide. The invention
also includes fragments, preferably biologically active fragments. These and other
polypeptides are also referred to herein as M. catarrhalis polypeptide analogs or variants.

The invention further provides nucleic acids, e.g., RNA or DNA and their respective 
complements, encoding a polypeptide of the invention. This includes double stranded 
nucleic acids as well as coding and antisense single strands. 

In preferred embodiments, the subject M. catarrhalis nucleic acid will include a
transcriptional regulatory sequence, e.g., at least one of a transcriptional promoter or
transcriptional enhancer sequence, operably linked to the M. catarrhalis gene sequence, e.g.,
to render the M. catarrhalis gene sequence suitable for expression in a recombinant host cell.

In yet a further preferred embodiment, the nucleic acid which encodes an M.
catarrhalis polypeptide of the invention hybridizes under stringent conditions to a nucleic
acid probe corresponding to at least about 8 consecutive nucleotides of the invention
contained in the Sequence Listing; more preferably to at least about 12 consecutive
nucleotides of the invention contained in the Sequence Listing; still more preferably to at
least about 20 consecutive nucleotides of the invention contained in the Sequence Listing;
most preferably to at least about 40 consecutive nucleotides of the invention contained in the
Sequence Listing.

In another aspect, the invention provides a substantially pure nucleic acid having a
nucleotide sequence which encodes an M. catarrhalis polypeptide. In preferred
embodiments: the encoded polypeptide has biological activity; the encoded polypeptide has
an amino acid sequence at least about 60%, 70%, 80%, 90%, 95%, 98% or 99% homologous
to an amino acid sequence of the invention contained in the Sequence Listing; the encoded
polypeptide has an amino acid sequence essentially the same as an amino acid sequence of
the invention contained in the Sequence Listing; the encoded polypeptide is at least about 5,
10, 20, 50, 100, or 150 amino acids in length; the encoded polypeptide comprises at least
about 5, preferably at least about 10, more preferably at least about 20, still more preferably
at least about 50, 100, or 150 contiguous amino acids of the invention contained in the
Sequence Listing.

In another aspect, the invention encompasses: a vector including a nucleic acid
which encodes an M. catarrhalis polypeptide or an M. catarrhalis polypeptide variant as
described herein; a host cell transfected with the vector; and a method of producing a
recombinant M. catarrhalis polypeptide or M. catarrhalis polypeptide variant, including
culturing the cell, e.g., in a cell culture medium, and isolating an M. catarrhalis polypeptide
or M. catarrhalis polypeptide variant, e.g., from the cell or from the cell culture medium.

One embodiment of the invention is directed to substantially isolated nucleic acids.
Nucleic acids of the invention include sequences comprising at least about 8 nucleotides in
length, more preferably at least about 12 nucleotides in length, even more preferably at least
about 15-20 nucleotides in length, that correspond to a subsequence of any one of SEQ ID
NO: 1 - SEQ ID NO: 1920 or complements thereof. Alternatively, the nucleic acids
comprise sequences contained within any ORF (open reading frame), including a complete
protein-coding sequence, of which any of SEQ ID NO: 1 - SEQ ID NO: 1920 forms a part.
The invention encompasses sequence-conservative variants and function-conservative
variants of these sequences. The nucleic acids may be DNA, RNA, DNA/RNA duplexes,
peptide nucleic acid (PNA), or derivatives thereof.

In another aspect, the invention features a purified recombinant nucleic acid having at
least about 50%, 60%, 70%, 80%, 90%, 95%, 98%, or 99% homology with a sequence of the
invention contained in the Sequence Listing.

The invention also encompasses recombinant DNA (including DNA cloning and
expression vectors) comprising these M. catarrhalis-derived sequences; host cells
comprising such DNA, including fungal, bacterial, yeast, plant, insect, and mammalian host
cells; and methods for producing expression products comprising RNA and polypeptides
encoded by the M. catarrhalis sequences. These methods are carried out by incubating a
host cell comprising an M. catarrhalis-derived nucleic acid sequence under conditions in
which the sequence is expressed. The host cell may be native or recombinant. The
polypeptides can be obtained by (a) harvesting the incubated cells to produce a cell fraction
and a medium fraction; and (b) recovering the M. catarrhalis polypeptide from the cell
fraction, the medium fraction, or both. The polypeptides can also be made by in vitro
translation.

In another aspect, the invention features nucleic acids capable of binding mRNA of
M. catarrhalis. Such nucleic acid is capable of acting as antisense nucleic acid to control
the translation of mRNA of M. catarrhalis. A further aspect features a nucleic acid which is
capable of binding specifically to an M. catarrhalis nucleic acid. These nucleic acids are
also referred to herein as complements and have utility as probes and as capture reagents.
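
The notion of a complement can be illustrated with a brief sketch that derives the reverse complement of a DNA sequence and, from an mRNA segment, a complementary DNA oligonucleotide written 5' to 3'. The function names and example sequences are hypothetical, and the sketch addresses only base pairing on paper; it says nothing about which oligonucleotides would be effective as antisense agents or probes in practice.

```python
# Translation table for Watson-Crick pairing of unambiguous DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def antisense_dna_for_mrna(mrna: str) -> str:
    """Return a DNA oligonucleotide (5'->3') complementary to an mRNA segment."""
    # Rewrite the RNA in the DNA alphabet (U -> T), then take the reverse complement.
    return reverse_complement(mrna.upper().replace("U", "T"))


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(antisense_dna_for_mrna("AUGGCU"))
```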

In another aspect, the invention features an expression system comprising an open
reading frame corresponding to M. catarrhalis nucleic acid. The nucleic acid further
comprises a control sequence compatible with an intended host. The expression system is
useful for making polypeptides corresponding to M. catarrhalis nucleic acid.

In another aspect, the invention encompasses: a vector including a nucleic acid which
encodes an M. catarrhalis polypeptide or an M. catarrhalis polypeptide variant as described
herein; a host cell transfected with the vector; and a method of producing a recombinant M.
catarrhalis polypeptide or M. catarrhalis polypeptide variant, including culturing the cell,
e.g., in a cell culture medium, and isolating the M. catarrhalis polypeptide or M. catarrhalis
polypeptide variant, e.g., from the cell or from the cell culture medium.

Yet another embodiment of the invention encompasses reagents for detecting
bacterial infection, including M. catarrhalis infection, which comprise at least one M.
catarrhalis-derived nucleic acid defined by any one of SEQ ID NO: 1 - SEQ ID NO: 1920,
or sequence-conservative or function-conservative variants thereof. Alternatively, the
diagnostic reagents comprise nucleotide sequences that are contained within any open
reading frames (ORFs), including preferably complete protein-coding sequences, contained
within any of SEQ ID NO: 1 - SEQ ID NO: 1920, or polypeptide sequences contained within
any of SEQ ID NO: 1921 - SEQ ID NO: 3840, or polypeptides of which any of the above
sequences forms a part, or antibodies directed against any of the above peptide sequences or
function-conservative variants and/or fragments thereof.

The invention further provides antibodies, preferably monoclonal antibodies, which
specifically bind to the polypeptides of the invention. Methods are also provided for
producing antibodies in a host animal. The methods of the invention comprise immunizing
an animal with at least one M. catarrhalis-derived immunogenic component, wherein the
immunogenic component comprises one or more of the polypeptides encoded by any one of
SEQ ID NO: 1 - SEQ ID NO: 1920 or sequence-conservative or function-conservative
variants thereof; or polypeptides that are contained within any ORFs, including complete
protein-coding sequences, of which any of SEQ ID NO: 1 - SEQ ID NO: 1920 forms a part;
or polypeptide sequences contained within any of SEQ ID NO: 1921 - SEQ ID NO: 3840; or
polypeptides of which any of SEQ ID NO: 1921 - SEQ ID NO: 3840 forms a part. Host
animals include any warm blooded animal, including without limitation mammals and birds.
Such antibodies have utility as reagents for immunoassays to evaluate the abundance and
distribution of M. catarrhalis-specific antigens.

In yet another aspect, the invention provides diagnostic methods for detecting M.
catarrhalis antigenic components or anti-M. catarrhalis antibodies in a sample. M.
catarrhalis antigenic components may be detected by known processes, including but not
limited to detection by a process comprising: (i) contacting a sample suspected to contain a
bacterial antigenic component with a bacterial-specific antibody, under conditions in which a
stable antigen-antibody complex can form between the antibody and bacterial antigenic
components in the sample; and (ii) detecting any antigen-antibody complex formed in step
(i), wherein detection of an antigen-antibody complex indicates the presence of at least one
bacterial antigenic component in the sample. In different embodiments of this method, the
antibodies used are directed against a sequence encoded by any of SEQ ID NO: 1 - SEQ ID
NO: 1920 or sequence-conservative or function-conservative variants thereof, or against a
polypeptide sequence contained in any of SEQ ID NO: 1921 - SEQ ID NO: 3840 or
function-conservative variants thereof.

In yet another aspect, the invention provides a method for detecting
antibacterial-specific antibodies in a sample, which comprises: (i) contacting a sample suspected to
contain antibacterial-specific antibodies with an M. catarrhalis antigenic component, under
conditions in which a stable antigen-antibody complex can form between the M. catarrhalis
antigenic component and antibacterial antibodies in the sample; and (ii) detecting any
antigen-antibody complex formed in step (i), wherein detection of an antigen-antibody
complex indicates the presence of antibacterial antibodies in the sample. In different
embodiments of this method, the antigenic component is encoded by a sequence contained in
any of SEQ ID NO: 1 - SEQ ID NO: 1920 or sequence-conservative and
function-conservative variants thereof, or is a polypeptide sequence contained in any of SEQ ID NO:
1921 - SEQ ID NO: 3840 or function-conservative variants thereof.

In another aspect, the invention features a method of generating vaccines for
immunizing an individual against M. catarrhalis. The method includes: immunizing a
subject with an M. catarrhalis polypeptide, e.g., a surface or secreted polypeptide, or a
combination of such peptides or active portion(s) thereof, and a pharmaceutically acceptable
carrier. Such vaccines have therapeutic and prophylactic utilities.

In another aspect, the invention features a method of evaluating a compound, e.g., a
polypeptide, e.g., a fragment of a host cell polypeptide, for the ability to bind an M.
catarrhalis polypeptide. The method includes contacting the compound to be evaluated with
an M. catarrhalis polypeptide and determining if the compound binds or otherwise interacts
with the M. catarrhalis polypeptide. Compounds which bind or otherwise interact with M.
catarrhalis polypeptides are candidates as modulators, including activators and inhibitors, of
the bacterial life cycle. These assays can be performed in vitro or in vivo.

In another aspect, the invention features a method of evaluating a compound, e.g., a
polypeptide, e.g., a fragment of a host cell polypeptide, for the ability to bind an M.
catarrhalis nucleic acid, e.g., DNA or RNA. The method includes contacting the compound
to be evaluated with an M. catarrhalis nucleic acid and determining if the compound binds
or otherwise interacts with the M. catarrhalis nucleic acid. Compounds which bind the M.
catarrhalis nucleic acid are candidates as modulators, including activators and inhibitors, of
the bacterial life cycle. These assays can be performed in vitro or in vivo.

A particularly preferred embodiment of the invention is directed to a method of
screening test compounds for anti-bacterial activity, which method comprises: selecting as a
target a bacterial specific sequence, which sequence is essential to the viability of a bacterial
species; contacting a test compound with said target sequence; and selecting those test
compounds which bind to said target sequence as potential anti-bacterial candidates. In one
embodiment, the target sequence selected is specific to a single species, or even a single
strain, such as, for example, the strain M. catarrhalis 98-4362. In a second embodiment, the
target sequence is common to at least two species of bacteria. In a third embodiment, the
target sequence is common to a family of bacteria. The target sequence may be a nucleic
acid sequence or a polypeptide sequence. Methods employing sequences common to more
than one species of microorganism may be used to screen candidates for broad spectrum
anti-bacterial activity.

The invention also provides methods for preventing or treating disease caused by
certain bacteria, including M. catarrhalis, which are carried out by administering to an
animal in need of such treatment, in particular a warm-blooded vertebrate, including but not
limited to birds and mammals, a compound that specifically inhibits or interferes with the
function of a bacterial polypeptide or nucleic acid. In a particularly preferred embodiment,
the mammal to be treated is human.

DETAILED DESCRIPTION OF THE INVENTION 

The sequences of the present invention include the specific nucleic acid and amino
acid sequences set forth in the Sequence Listing that forms a part of the present specification,
and which are designated SEQ ID NO: 1 - SEQ ID NO: 3840. Use of the terms "SEQ ID
NO: 1 - SEQ ID NO: 1920", "SEQ ID NO: 1921 - SEQ ID NO: 3840", "the sequences
depicted in Table 2", etc., is intended, for convenience, to refer to each individual SEQ ID
NO individually, and is not intended to refer to the genus of these sequences unless such
reference is indicated. In other words, it is a shorthand for listing all of these
sequences individually. The invention encompasses each sequence individually, as well as
any combination thereof.

DEFINITIONS 

"Nucleic acid" or "polynucleotide" as used herein refers to purine- and pyrimidine-
containing polymers of any length, either polyribonucleotides or polydeoxyribonucleotides
or mixed polyribo-polydeoxyribo nucleotides. This includes single- and double-stranded
molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as "protein nucleic
acids" (PNA) formed by conjugating bases to an amino acid backbone. This also includes
nucleic acids containing modified bases.

A nucleic acid or polypeptide sequence that is "derived from" a designated sequence
refers to a sequence that corresponds to a region of the designated sequence. For nucleic
acid sequences, this encompasses sequences that are homologous or complementary to the
sequence, as well as "sequence-conservative variants" and "function-conservative variants."
For polypeptide sequences, this encompasses "function-conservative variants."
Sequence-conservative variants are those in which a change of one or more nucleotides in a
given codon position results in no alteration in the amino acid encoded at that position.
Function-conservative variants are those in which a given amino acid residue in a polypeptide
has been changed without altering the overall conformation and function of the native
polypeptide, including, but not limited to, replacement of an amino acid with one having
similar physico-chemical properties (such as, for example, acidic, basic, hydrophobic, and
the like). "Function-conservative" variants also include any polypeptides that have the
ability to elicit antibodies specific to a designated polypeptide.
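
By way of illustration only, the definition of a sequence-conservative variant can be checked computationally: two coding sequences qualify when they differ at the nucleotide level yet translate to the same amino acid sequence. The following is a minimal sketch in Python, assuming the standard genetic code and in-frame DNA input; the function names are hypothetical and form no part of the disclosed methods.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an in-frame DNA (or RNA) coding sequence into one-letter amino acids."""
    dna = coding_sequence.upper().replace("U", "T")
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_sequence_conservative_variant(reference, variant):
    """True if the variant differs in nucleotides but encodes the same amino acids."""
    ref = reference.upper().replace("U", "T")
    var = variant.upper().replace("U", "T")
    return ref != var and translate(ref) == translate(var)
```

For example, ATGGCTAAA and ATGGCCAAG both encode Met-Ala-Lys, so the second is a sequence-conservative variant of the first, whereas ATGGATAAA changes alanine to aspartic acid and is not.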

An "M. catarrhalis-derived" nucleic acid or polypeptide sequence may or may not be
present in other bacterial species, and may or may not be present in all M. catarrhalis strains.
This term is intended to refer to the source from which the sequence was originally isolated.
Thus, an M. catarrhalis-derived polypeptide, as used herein, may be used, e.g., as a target to
screen for a broad spectrum antibacterial agent, to search for homologous proteins in other
species of bacteria or in eukaryotic organisms such as humans, etc.

A purified or isolated polypeptide or a substantially pure preparation of a polypeptide
are used interchangeably herein and, as used herein, mean a polypeptide that has been
separated from other proteins, lipids, and nucleic acids with which it naturally occurs.
Preferably, the polypeptide is also separated from substances, e.g., antibodies or gel matrix,
e.g., polyacrylamide, which are used to purify it. Preferably, the polypeptide constitutes at
least about 10, 20, 50, 70, 80 or 95% dry weight of the purified preparation. Preferably, the
preparation contains sufficient polypeptide to allow protein sequencing; at least about 1, 10,
or preferably 100 mg of polypeptide.

A purified preparation of cells refers to, in the case of plant or animal cells, an in
vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells
or microbial cells, it consists of a preparation of at least about 10%, more preferably at least
about 50%, of the subject cells.

A purified or isolated or a substantially pure nucleic acid, e.g., a substantially pure
DNA, (are terms used interchangeably herein) is a nucleic acid which is one or both of the
following: not immediately contiguous with both of the coding sequences with which it is
immediately contiguous (i.e., one at the 5' end and one at the 3' end) in the
naturally-occurring genome and plasmids of the organism from which the nucleic acid is
derived; or which is substantially free of a nucleic acid with which it occurs in the organism
from which the nucleic acid is derived. The term includes, for example, a recombinant DNA
which is incorporated into a vector, e.g., into an autonomously replicating plasmid or virus,
or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule
(e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease
treatment) independent of other DNA sequences. Substantially pure DNA also includes a
recombinant DNA which is part of a hybrid gene encoding additional M. catarrhalis DNA
sequence.

A "contig" as used herein is a nucleic acid representing a continuous stretch of
genomic sequence of an organism.

An "open reading frame", also referred to herein as ORF, is a region of nucleic acid
which encodes a polypeptide. This region may represent a portion of a coding sequence or a
total sequence and can be determined from a stop to stop codon or from a start to stop codon.

As used herein, a "coding sequence" is a nucleic acid which is transcribed into
messenger RNA and/or translated into a polypeptide when placed under the control of
appropriate regulatory sequences. The boundaries of the coding sequence are determined by
a translation start codon at the five prime terminus and a translation stop codon at the three
prime terminus. A coding sequence can include but is not limited to messenger RNA,
synthetic DNA, and recombinant nucleic acid sequences.

A "complement" of a nucleic acid as used herein refers to an anti-parallel or antisense 
sequence that participates in Watson-Crick base-pairing with the original sequence. 
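
As an illustrative sketch only, the anti-parallel complement of a DNA sequence can be computed by complementing each base according to Watson-Crick pairing (A with T, G with C) and reversing the result. The function name below is hypothetical, and the sketch assumes an unambiguous DNA alphabet.

```python
# Watson-Crick pairing for DNA: A<->T and G<->C (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the anti-parallel complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]
```

Applied to the sequence ATTGCC, this yields GGCAAT, which pairs base-for-base with the original when the two strands are aligned in opposite orientations.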

A "gene product" is a protein or structural RNA which is specifically encoded by a
gene.

As used herein, the term "probe" refers to a nucleic acid, peptide or other chemical
entity which specifically binds to a molecule of interest. Probes are often associated with or
capable of associating with a label. A label is a chemical moiety capable of detection.
Typical labels comprise dyes, radioisotopes, luminescent and chemiluminescent moieties,
fluorophores, enzymes, precipitating agents, amplification sequences, and the like.
Similarly, a nucleic acid, peptide or other chemical entity which specifically binds to a
molecule of interest and immobilizes such molecule is referred to herein as a "capture ligand".
Capture ligands are typically associated with or capable of associating with a support such as
nitro-cellulose, glass, nylon membranes, beads, particles and the like. The specificity of
hybridization is dependent on conditions such as the base pair composition of the
nucleotides, and the temperature and salt concentration of the reaction. These conditions are
readily discernible to one of ordinary skill in the art using routine experimentation.

"Homologous" refers to the sequence similarity or sequence identity between two
polypeptides or between two nucleic acid molecules. When a position in both of the two
compared sequences is occupied by the same base or amino acid monomer subunit, e.g., if a
position in each of two DNA molecules is occupied by adenine, then the molecules are
homologous at that position. The percent of homology between two sequences is a function
of the number of matching or homologous positions shared by the two sequences divided by
the number of positions compared, multiplied by 100. For example, if 6 of 10 of the positions
in two sequences are matched or homologous then the two sequences are 60% homologous. By
way of example, the DNA sequences ATTGCC and TATGGC share 50% homology.
Generally, a comparison is made when two sequences are aligned to give maximum
homology.
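
The percent homology calculation defined above can be sketched as follows. This is an illustration only; it assumes that the two sequences have already been aligned to give maximum homology and are of equal length with no gaps, and the function name is hypothetical.

```python
def percent_homology(seq_a, seq_b):
    """Percent of compared positions occupied by the same base or amino acid.

    Assumes the sequences are already aligned, ungapped, and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)
```

For the sequences ATTGCC and TATGGC, positions 3, 4 and 6 match, so 3 of 6 positions are homologous and the function returns 50.0, in agreement with the example given above.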

Nucleic acids are hybridizable to each other when at least one strand of a nucleic acid
can anneal to the other nucleic acid under defined stringency conditions. Stringency of
hybridization is determined by: (a) the temperature at which hybridization and/or washing is
performed; and (b) the ionic strength and polarity of the hybridization and washing solutions.
Hybridization requires that the two nucleic acids contain complementary sequences;
depending on the stringency of hybridization, however, mismatches may be tolerated.
Typically, hybridization of two sequences at high stringency (such as, for example, in a
solution of 0.5X SSC, at 65° C) requires that the sequences be essentially completely
homologous. Conditions of intermediate stringency (such as, for example, 2X SSC at 65° C)
and low stringency (such as, for example, 2X SSC at 55° C) require correspondingly less
overall complementarity between the hybridizing sequences. (1X SSC is 0.15 M NaCl,
0.015 M Na citrate).
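
For convenience, the molar concentrations implied by the exemplary SSC strengths can be worked out from the 1X composition given above. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the listed conditions are merely the examples recited in the preceding paragraph.

```python
# Composition of 1X SSC as stated in the text (molar concentrations).
SSC_1X_MOLAR = {"NaCl": 0.15, "Na citrate": 0.015}

# Exemplary conditions from the text: (SSC strength, temperature in degrees C).
EXAMPLE_STRINGENCY = {
    "high": (0.5, 65),
    "intermediate": (2.0, 65),
    "low": (2.0, 55),
}


def ssc_molarity(strength):
    """Scale the 1X SSC composition to the requested strength (e.g. 0.5 for 0.5X)."""
    return {solute: strength * molar for solute, molar in SSC_1X_MOLAR.items()}
```

Under these figures, 0.5X SSC corresponds to 0.075 M NaCl and 0.0075 M Na citrate, and 2X SSC to 0.30 M NaCl and 0.030 M Na citrate.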

The terms peptides, proteins, and polypeptides are used interchangeably herein. 

As used herein, the term "surface protein" refers to all surface accessible proteins, 
e.g. inner and outer membrane proteins, proteins adhering to the cell wall, and secreted 
proteins. 

A polypeptide has M. catarrhalis biological activity if it has one, two or preferably
more of the following properties: (1) when expressed in the course of an M. catarrhalis
infection, it can promote or mediate the attachment of M. catarrhalis to a cell; (2) it has an
enzymatic activity, structural or regulatory function characteristic of an M. catarrhalis
protein; (3) the gene which encodes it can rescue a lethal mutation in an M. catarrhalis gene.
A polypeptide has biological activity if it is an antagonist, agonist, or super-agonist of a
polypeptide having one of the above-listed properties.

A biologically active fragment or analog is one having an in vivo or in vitro activity
which is characteristic of the M. catarrhalis polypeptides of the invention contained in the
Sequence Listing, or of other naturally occurring M. catarrhalis polypeptides, e.g., one or
more of the biological activities described herein. Especially preferred are fragments which
exist in vivo, e.g., fragments which arise from post-transcriptional processing or which arise
from translation of alternatively spliced RNAs. Fragments include those expressed in native
or endogenous cells as well as those made in expression systems, e.g., in CHO (Chinese
Hamster Ovary) cells. Because peptides such as M. catarrhalis polypeptides often exhibit a
range of physiological properties and because such properties may be attributable to different
portions of the molecule, a useful M. catarrhalis fragment or M. catarrhalis analog is one
which exhibits a biological activity in any biological assay for M. catarrhalis activity. The
fragment or analog possesses about 10%, preferably about 40%, more preferably about 60%,
70%, 80% or 90% or greater of the activity of M. catarrhalis, in any in vivo or in vitro
assay.

Analogs can differ from naturally occurring M. catarrhalis polypeptides in amino
acid sequence or in ways that do not involve sequence, or both. Non-sequence modifications
include changes in acetylation, methylation, phosphorylation, carboxylation, or
glycosylation. Preferred analogs include M. catarrhalis polypeptides (or biologically active
fragments thereof) whose sequences differ from the wild-type sequence by one or more
conservative amino acid substitutions or by one or more non-conservative amino acid
substitutions, deletions, or insertions which do not substantially diminish the biological
activity of the M. catarrhalis polypeptide. Conservative substitutions typically include the
substitution of one amino acid for another with similar characteristics, e.g., substitutions
within the following groups: valine, glycine; glycine, alanine; valine, isoleucine, leucine;
aspartic acid, glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and
phenylalanine, tyrosine. Other conservative substitutions can be made in view of the table
below.
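
As an illustration only, the substitution groups recited in the preceding paragraph can be encoded with one-letter amino acid codes and used to test whether a given replacement falls within one of them. The names below are hypothetical, and the sketch covers only the groups listed in the text, not the further replacements of Table 1.

```python
# Substitution groups as recited in the text, in one-letter amino acid codes:
# Val/Gly; Gly/Ala; Val/Ile/Leu; Asp/Glu; Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    {"V", "G"},
    {"G", "A"},
    {"V", "I", "L"},
    {"D", "E"},
    {"N", "Q"},
    {"S", "T"},
    {"K", "R"},
    {"F", "Y"},
]


def is_conservative_substitution(original, replacement):
    """True if both residues differ and share at least one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this encoding, replacing aspartic acid with glutamic acid (D to E) is conservative, whereas replacing aspartic acid with lysine (D to K) is not. Because the groups overlap rather than partition the amino acids, glycine pairs with both valine and alanine, while alanine and valine do not pair with each other.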

TABLE 1

CONSERVATIVE AMINO ACID REPLACEMENTS

For Amino Acid    Code    Replace with any of

Alanine           A       D-Ala, Gly, beta-Ala, L-Cys, D-Cys
Arginine          R       D-Arg, Lys, D-Lys, homo-Arg, D-homo-Arg, Met, Ile,
                          D-Met, D-Ile, Orn, D-Orn
Asparagine        N       D-Asn, Asp, D-Asp, Glu, D-Glu, Gln, D-Gln
Aspartic Acid     D       D-Asp, D-Asn, Asn, Glu, D-Glu, Gln, D-Gln
Cysteine          C       D-Cys, S-Me-Cys, Met, D-Met, Thr, D-Thr
Glutamine         Q       D-Gln, Asn, D-Asn, Glu, D-Glu, Asp, D-Asp
Glutamic Acid     E       D-Glu, D-Asp, Asp, Asn, D-Asn, Gln, D-Gln
Glycine           G       Ala, D-Ala, Pro, D-Pro, beta-Ala, Acp
Isoleucine        I       D-Ile, Val, D-Val, Leu, D-Leu, Met, D-Met
Leucine           L       D-Leu, Val, D-Val, Leu, D-Leu, Met, D-Met
Lysine            K       D-Lys, Arg, D-Arg, homo-Arg, D-homo-Arg, Met, D-Met,
                          Ile, D-Ile, Orn, D-Orn
Methionine        M       D-Met, S-Me-Cys, Ile, D-Ile, Leu, D-Leu, Val, D-Val
Phenylalanine     F       D-Phe, Tyr, D-Thr, L-Dopa, His, D-His, Trp, D-Trp,
                          Trans-3,4, or 5-phenylproline, cis-3,4, or
                          5-phenylproline
Proline           P       D-Pro, L-I-thioazolidine-4-carboxylic acid, D- or L-1-
                          oxazolidine-4-carboxylic acid
Serine            S       D-Ser, Thr, D-Thr, allo-Thr, Met, D-Met, Met(O),
                          D-Met(O), L-Cys, D-Cys
Threonine         T       D-Thr, Ser, D-Ser, allo-Thr, Met, D-Met, Met(O),
                          D-Met(O), Val, D-Val
Tyrosine          Y       D-Tyr, Phe, D-Phe, L-Dopa, His, D-His
Valine            V       D-Val, Leu, D-Leu, Ile, D-Ile, Met, D-Met

Other analogs within the invention are those with modifications which increase
peptide stability; such analogs may contain, for example, one or more non-peptide bonds
(which replace the peptide bonds) in the peptide sequence. Also included are: analogs that
include residues other than naturally occurring L-amino acids, e.g., D-amino acids or
non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids; and cyclic analogs.

As used herein, the term "fragment", as applied to an M. catarrhalis analog, will
ordinarily be at least about 20 residues, more typically at least about 40 residues, preferably
at least about 60 residues in length. Fragments of M. catarrhalis polypeptides can be
generated by methods known to those skilled in the art. The ability of a Moraxella
fragment to exhibit a biological activity of an M. catarrhalis polypeptide can be assessed by
methods known to those skilled in the art as described herein. Also included are M.
catarrhalis polypeptides containing residues that are not required for biological activity of
the peptide or that result from alternative mRNA splicing or alternative protein processing
events.

An "immunogenic component" as used herein is a moiety, such as an M. catarrhalis
polypeptide, analog or fragment thereof, that is capable of eliciting a humoral and/or cellular
immune response in a host animal.

An "antigenic component" as used herein is a moiety, such as an M. catarrhalis
polypeptide, analog or fragment thereof, that is capable of binding to a specific antibody with
sufficiently high affinity to form a detectable antigen-antibody complex.

The term "antibody" as used herein is intended to include fragments thereof which
are specifically reactive with M. catarrhalis polypeptides.

As used herein, the term "cell-specific promoter" means a DNA sequence that serves
as a promoter, i.e., regulates expression of a selected DNA sequence operably linked to the
promoter, and which effects expression of the selected DNA sequence in specific cells of a
tissue. The term also covers so-called "leaky" promoters, which regulate expression of a
selected DNA primarily in one tissue, but cause expression in other tissues as well.

Misexpression, as used herein, refers to a non-wild type pattern of gene expression.
It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of
expression that differs from wild type in terms of the time or stage at which the gene is
expressed, e.g., increased or decreased expression (as compared with wild type) at a
predetermined developmental period or stage; a pattern of expression that differs from wild
type in terms of increased expression (as compared with wild type) in a predetermined cell
type or tissue type; a pattern of expression that differs from wild type in terms of the splicing
size, amino acid sequence, post-translational modification, or biological activity of the
expressed polypeptide; a pattern of expression that differs from wild type in terms of the
effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g.,
a pattern of increased or decreased expression (as compared with wild type) in the presence
of an increase or decrease in the strength of the stimulus.

As used herein, "host cells" and other such terms denoting microorganisms or higher
eukaryotic cell lines cultured as unicellular entities refers to cells which can become or have
been used as recipients for a recombinant vector or other transfer DNA, and include the
progeny of the original cell which has been transfected. It is understood by individuals
skilled in the art that the progeny of a single parental cell may not necessarily be completely
identical in genomic or total DNA complement to the original parent, due to accidental or
deliberate mutation.

As used herein, the term "control sequence" refers to a nucleic acid having a base
sequence which is recognized by the host organism to effect the expression of encoded
sequences to which it is ligated. The nature of such control sequences differs depending
upon the host organism; in prokaryotes, such control sequences generally include a promoter,
ribosomal binding site, terminators, and in some cases operators; in eukaryotes, generally
such control sequences include promoters, terminators and in some instances, enhancers. The
term control sequence is intended to include at a minimum, all components whose presence
is necessary for expression, and may also include additional components whose presence is
advantageous, for example, leader sequences.

As used herein, the term "operably linked" refers to sequences joined or ligated to 
function in their intended manner. For example, a control sequence is operably linked to 
coding sequence by ligation in such a way that expression of the coding sequence is achieved 
under conditions compatible with the control sequence and host cell. 






Applicant's Docket No.: PATH03-14 



The "metabolism" of a substance, as used herein, means any aspect of the expression,
function, action, or regulation of the substance. The metabolism of a substance includes
modifications, e.g., covalent or non-covalent modifications of the substance. The metabolism
of a substance includes modifications, e.g., covalent or non-covalent modification, the
substance induces in other substances. The metabolism of a substance also includes changes
in the distribution of the substance. The metabolism of a substance includes changes the
substance induces in the distribution of other substances.

A "sample" as used herein refers to a biological sample, such as, for example, tissue
or fluid isolated from an individual (including without limitation plasma, serum,
cerebrospinal fluid, lymph, tears, saliva and tissue sections) or from in vitro cell culture
constituents, as well as samples from the environment.

Technical and scientific terms used herein have the meanings commonly understood
by one of ordinary skill in the art to which the present invention pertains, unless otherwise
defined. Reference is made herein to various methodologies known to those of skill in the
art. Publications and other materials setting forth such known methodologies to which
reference is made are incorporated herein by reference in their entireties as though set forth
in full. The practice of the invention will employ, unless otherwise indicated, conventional
techniques of chemistry, molecular biology, microbiology, recombinant DNA, and
immunology, which are within the skill of the art. Such techniques are explained fully in the
literature. See e.g., Sambrook, Fritsch, and Maniatis, Molecular Cloning: A Laboratory
Manual, 2nd ed. (1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed., 1984); Nucleic Acid Hybridization (B.D. Hames &
S.J. Higgins eds. 1984); the series, Methods in Enzymology (Academic Press, Inc.),
particularly Vol. 154 and Vol. 155 (Wu and Grossman, eds.); PCR-A Practical Approach
(McPherson, Quirke, and Taylor, eds., 1991); Immunology, 2d Edition, 1989, Roitt et al.,
C.V. Mosby Company, New York; Advanced Immunology, 2d Edition, 1991, Male et
al., Gower Medical Publishing, New York; DNA Cloning: A Practical Approach, Volumes
I and II, 1985 (D.N. Glover ed.); Oligonucleotide Synthesis, 1984 (M.J. Gait ed.);
Transcription and Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986
(R.I. Freshney ed.); Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A
Practical Guide to Molecular Cloning; Gene Transfer Vectors for Mammalian Cells, 1987
(J. H. Miller and M. P. Calos eds., Cold Spring Harbor Laboratory); Martin J. Bishop, ed.,
Guide to Human Genome Computing, 2d Edition, Academic Press, San Diego, CA. (1998);
and Leonard F. Peruski, Jr., and Anne Harwood Peruski, The Internet and the New Biology:
Tools for Genomic and Molecular Research, American Society for Microbiology,
Washington, D.C. (1997).

Any suitable materials and/or methods known to those of skill can be utilized in
carrying out the present invention; however, preferred materials and/or methods are
described. Materials, reagents and the like to which reference is made in the following
description and examples are obtainable from commercial sources, unless otherwise noted.

M. CATARRHALIS GENOMIC SEQUENCE

This invention provides nucleotide sequences of the genome of M. catarrhalis which
thus comprises a DNA sequence library of M. catarrhalis genomic DNA. The detailed
description that follows provides nucleotide sequences of M. catarrhalis, and also describes
how the sequences were obtained and how ORFs and protein-coding sequences were
identified. Also described are compositions and methods of using the disclosed M.
catarrhalis sequences in methods including diagnostic and therapeutic applications.
Furthermore, the library can be used as a database for identification and comparison of
medically important sequences in this and other strains of M. catarrhalis.

To determine the genomic sequence of M. catarrhalis, DNA from strain 98-4362 of
M. catarrhalis was isolated and a library of DNA fragments was transformed into DH5α
cells. DNA sequencing was achieved using established ABI sequencing methods on ABI 377
automated DNA sequencers. The cloning and sequencing procedures are described in more
detail in the Exemplification.

Individual sequence reads were assembled using PHRAP (P. Green, Abstracts of
DOE Human Genome Program Contractor-Grantee Workshop V, Jan. 1996, p. 157). The
average contig length was about 3-4 kb.

All subsequent steps were based on sequencing by ABI 377 automated DNA
sequencing methods.

A variety of approaches may be used to order the contigs so as to obtain a continuous
sequence representing the entire M. catarrhalis genome. Synthetic oligonucleotides are
designed that are complementary to sequences at the end of each contig. These
oligonucleotides may be hybridized to libraries of M. catarrhalis genomic DNA in, for
example, lambda phage vectors or plasmid vectors to identify clones that contain sequences
corresponding to the junctional regions between individual contigs. Such clones are then
used to isolate template DNA and the same oligonucleotides are used as primers in
polymerase chain reaction (PCR) to amplify junctional fragments, the nucleotide sequence of
which is then determined.
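
One possible way of deriving such contig-end oligonucleotides is sketched below for illustration. The sketch assumes that each oligonucleotide is taken from the terminal bases of the contig and oriented so that, when used as a primer, it extends outward across the adjoining gap; the default length of 20 nucleotides and the function names are assumptions made for the example and are not specified in the procedure described above.

```python
# Watson-Crick pairing for DNA: A<->T and G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the anti-parallel complement of an upper-case DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def contig_end_oligos(contig, length=20):
    """Return (left_oligo, right_oligo), each written 5' to 3'.

    left_oligo  primes outward past the 5' end of the contig (bottom strand);
    right_oligo primes outward past the 3' end of the contig (top strand).
    """
    contig = contig.upper()
    if len(contig) < length:
        raise ValueError("contig is shorter than the requested oligo length")
    left_oligo = reverse_complement(contig[:length])
    right_oligo = contig[-length:]
    return left_oligo, right_oligo
```

For a toy contig ATGCCGTTAACG and a length of 4, the function returns GCAT for the left end and AACG for the right end.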

The M. catarrhalis sequences were analyzed for the presence of open reading frames
(ORFs) comprising at least 180 nucleotides. As a result of the analysis of ORFs based on
stop-to-stop codon reads, it should be understood that these ORFs may not correspond to the
ORF of a naturally-occurring M. catarrhalis polypeptide. These ORFs may contain start
codons which indicate the initiation of protein synthesis of a naturally-occurring M.
catarrhalis polypeptide. Such start codons within the ORFs provided herein were identified
by those of ordinary skill in the relevant art, and the resulting ORF and the encoded M.
catarrhalis polypeptide is within the scope of this invention. For example, within the ORFs
a codon such as AUG or GUG (encoding methionine or valine) which is part of the initiation
signal for protein synthesis was identified and the portion of an ORF corresponding to a
naturally-occurring M. catarrhalis polypeptide was recognized. The predicted coding
regions were defined by evaluating the coding potential of such sequences with the program
GENEMARK™ (Borodovsky and McIninch, 1993, Comp. Chem. 17:123).
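
A stop-to-stop ORF scan of the kind referred to above can be sketched as follows. This is an illustrative example only and not the procedure actually used: it assumes the standard stop codons TAA, TAG and TGA, scans the three reading frames of the strand supplied (the opposite strand can be scanned by passing its reverse complement), and reports only stretches bounded by an in-frame stop codon on both sides, ignoring segments truncated by the ends of the sequence. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_to_stop_orfs(dna, min_length=180):
    """Find stretches between consecutive in-frame stop codons.

    Returns a list of (start, end, frame) tuples with 0-based, half-open
    coordinates that exclude both flanking stop codons. Only stretches of at
    least min_length nucleotides are reported.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        previous_stop = None  # position of the last in-frame stop codon seen
        for pos in range(frame, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if previous_stop is not None:
                    start = previous_stop + 3
                    if pos - start >= min_length:
                        orfs.append((start, pos, frame))
                previous_stop = pos
    return orfs
```

For example, a sequence consisting of TAA, sixty GCT codons and TAG contains one qualifying stretch of exactly 180 nucleotides in frame 0; with only fifty-nine GCT codons the stretch is 177 nucleotides long and is not reported.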

Each predicted ORF amino acid sequence was compared with all sequences found in
current GENBANK, SWISS-PROT, and PIR databases using the BLAST algorithm. BLAST
identifies local alignments occurring by chance between the ORF sequence and the sequence
in the databank (Altschul et al., 1990, J. Mol. Biol. 215:403-410). Homologous ORFs
(probabilities less than 10⁻⁵ by chance) and ORFs that are probably non-homologous
(probabilities greater than 10⁻⁵ by chance) but have good codon usage were identified. Both
homologous sequences and non-homologous sequences with good codon usage are likely to
encode proteins and are encompassed by the invention.

M. CATARRHALIS NUCLEIC ACIDS

The present invention provides a library of M. catarrhalis-derived nucleic acid
sequences. The libraries provide probes, primers, and markers for use in
epidemiological studies. The present invention also provides a library of M.
catarrhalis-derived nucleic acid sequences which comprise or encode targets for therapeutic
drugs.

The nucleic acids of this invention may be obtained directly from the DNA of the
above referenced M. catarrhalis strain by using the polymerase chain reaction (PCR). See
"PCR, A Practical Approach" (McPherson, Quirke, and Taylor, eds., IRL Press, Oxford,
UK, 1991) for details about the PCR. High fidelity PCR is used to ensure a faithful DNA
copy prior to expression. In addition, the authenticity of amplified products is verified by
conventional sequencing methods. Clones carrying the desired sequences described in this
invention may also be obtained by screening the libraries by means of the PCR or by
hybridization of synthetic oligonucleotide probes to filter lifts of the library colonies or
plaques as known in the art (see, e.g., Sambrook et al., Molecular Cloning, A Laboratory
Manual 2nd edition, 1989, Cold Spring Harbor Press, NY).

It is also possible to obtain nucleic acids encoding M. catarrhalis polypeptides from a
cDNA library in accordance with protocols herein described. A cDNA encoding an M.
catarrhalis polypeptide can be obtained by isolating total mRNA from an appropriate strain.
Double stranded cDNAs can then be prepared from the total mRNA. Subsequently, the
cDNAs can be inserted into a suitable plasmid or viral (e.g., bacteriophage) vector using any
one of a number of known techniques. Genes encoding M. catarrhalis polypeptides can also
be cloned using established polymerase chain reaction techniques in accordance with the
nucleotide sequence information provided by the invention. The nucleic acids of the
invention can be DNA or RNA. Preferred nucleic acids of the invention are contained in the
Sequence Listing.

The nucleic acids of the invention can also be chemically synthesized using standard
techniques. Various methods of chemically synthesizing polydeoxynucleotides are known,
including solid-phase synthesis which, like peptide synthesis, has been fully automated in
commercially available DNA synthesizers (See e.g., Itakura et al. U.S. Patent No. 4,598,049;
Caruthers et al. U.S. Patent No. 4,458,066; and Itakura U.S. Patent Nos. 4,401,796 and
4,373,071, incorporated by reference herein).

In another example, DNA can be chemically synthesized using, e.g., the
phosphoramidite solid support method of Matteucci et al., 1981, J. Am. Chem. Soc.
103:3185, the method of Yoo et al., 1989, J. Biol. Chem. 264:17078, or other well known
methods. This can be done by sequentially linking a series of oligonucleotide cassettes
comprising pairs of synthetic oligonucleotides, as described below.

Nucleic acids isolated or synthesized in accordance with features of the present
invention are useful, by way of example, without limitation, as probes, primers, capture
ligands, antisense genes and for developing expression systems for the synthesis of proteins
and peptides corresponding to such sequences. As probes, primers, capture ligands and
antisense agents, the nucleic acid normally consists of all or part (approximately twenty or
more nucleotides for specificity as well as the ability to form stable hybridization products)
of the nucleic acids of the invention contained in the Sequence Listing. These uses are
described in further detail below.


PROBES 

A nucleic acid isolated or synthesized in accordance with the sequence of the
invention contained in the Sequence Listing can be used as a probe to specifically detect M.
catarrhalis. With the sequence information set forth in the present application, sequences
of twenty or more nucleotides are identified which provide the desired inclusivity and
exclusivity with respect to M. catarrhalis and extraneous nucleic acids likely to be
encountered during hybridization conditions. More preferably, the sequence will comprise at
least about twenty to thirty nucleotides to convey stability to the hybridization product
formed between the probe and the intended target molecules. Sequences larger than 1000
nucleotides in length are difficult to synthesize but can be generated by recombinant DNA
techniques. Individuals skilled in the art will readily recognize that the nucleic acids, for use
as probes, can be provided with a label to facilitate detection of a hybridization product.

Nucleic acid isolated or synthesized in accordance with the sequence of the
invention contained in the Sequence Listing can also be useful as probes to detect
homologous regions (especially homologous genes) of other Moraxella species using
appropriate stringency hybridization conditions as described herein.
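
As a non-limiting illustration of the inclusivity and exclusivity screen described above, the
following Python sketch lists subsequences of a target that do not occur in a set of
extraneous sequences. The function name, the default length of twenty nucleotides, and the
use of exact string matching are assumptions made for the example; actual hybridization
tolerates mismatches and involves both strands, which this sketch does not model.

```python
def candidate_probes(target, background, k=20):
    """Return (position, k-mer) pairs from `target` absent from all background sequences.

    Exact matching is a simplifying assumption; it is not a hybridization model.
    """
    target = target.upper()
    excluded = set()
    # Collect every k-mer present in the extraneous (background) sequences.
    for seq in background:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            excluded.add(seq[i:i + k])
    probes = []
    # Keep target k-mers that never appear in the background set.
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in excluded:
            probes.append((i, kmer))
    return probes
```

A candidate passing this screen would still need to be tested under the intended
hybridization conditions.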

CAPTURE LIGAND

For use as a capture ligand, the nucleic acid selected in the manner described above
with respect to probes can be readily associated with a support. The manner in which
nucleic acid is associated with supports is well known. Nucleic acid having twenty or more
nucleotides in a sequence of the invention contained in the Sequence Listing has utility to
separate M. catarrhalis nucleic acid of one strain from the nucleic acid of another
strain as well as from other organisms. Nucleic acid having twenty or more nucleotides in a
sequence of the invention contained in the Sequence Listing can also have utility to separate
other Moraxella species from each other and from other organisms. Preferably, the sequence
will comprise at least about twenty nucleotides to convey stability to the hybridization
product formed between the probe and the intended target molecules. Sequences larger than
1000 nucleotides in length are difficult to synthesize but can be generated by recombinant
DNA techniques.

PRIMERS 

Nucleic acid isolated or synthesized in accordance with the sequences described
herein has utility as primers for the amplification of M. catarrhalis nucleic acid. These
nucleic acids may also have utility as primers for the amplification of nucleic acids in other
Moraxella species. With respect to polymerase chain reaction (PCR) techniques, nucleic
acid sequences of >10-15 nucleotides of the invention contained in the Sequence Listing
have utility in conjunction with suitable enzymes and reagents to create copies of M.
catarrhalis nucleic acid. More preferably, the sequence will comprise twenty or more
nucleotides to convey stability to the hybridization product formed between the primer and
the intended target molecules. Binding conditions of primers greater than 100 nucleotides
are more difficult to control to obtain specificity. High fidelity PCR can be used to ensure a
faithful DNA copy prior to expression. In addition, amplified products can be checked by
conventional sequencing methods.
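
By way of a hedged example, the length preference stated above can be combined with a
rough melting-temperature estimate when screening primer candidates. The sketch below
uses the Wallace rule (2 degrees C per A or T, 4 degrees C per G or C), which is a common
approximation for short oligonucleotides and is not taken from the present disclosure; the
function name and the returned field names are likewise assumptions.

```python
def primer_summary(primer):
    """Summarize a candidate primer: length, GC fraction, Wallace-rule Tm estimate."""
    p = primer.upper()
    if not p or set(p) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return {
        "length": len(p),
        "gc_fraction": gc / len(p),
        # Wallace rule: a rough estimate only, intended for short oligonucleotides.
        "tm_wallace": 2 * at + 4 * gc,
        # The text prefers twenty or more nucleotides for a stable hybrid.
        "meets_preferred_length": len(p) >= 20,
    }
```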

The copies can be used in diagnostic assays to detect specific sequences, including
genes from M. catarrhalis and/or other Moraxella species. The copies can also be
incorporated into cloning and expression vectors to generate polypeptides corresponding to
the nucleic acid synthesized by PCR, as is described in greater detail herein.

The nucleic acids of the present invention find use as templates for the recombinant
production of M. catarrhalis-derived peptides or polypeptides.

ANTISENSE 

Nucleic acid or nucleic acid-hybridizing derivatives isolated or synthesized in
accordance with the sequences described herein have utility as antisense agents to prevent
the expression of M. catarrhalis genes. These sequences also have utility as antisense agents
to prevent expression of genes of other Moraxella species.

In one embodiment, nucleic acid or derivatives corresponding to M. catarrhalis
nucleic acids are loaded into a suitable carrier such as a liposome or bacteriophage for
introduction into bacterial cells. For example, a nucleic acid having twenty or more
nucleotides is capable of binding to bacterial nucleic acid or bacterial messenger RNA.
Preferably, the antisense nucleic acid is comprised of 20 or more nucleotides to provide
necessary stability of a hybridization product of non-naturally occurring nucleic acid and
bacterial nucleic acid and/or bacterial messenger RNA. Nucleic acid having a sequence
greater than 1000 nucleotides in length is difficult to synthesize but can be generated by
recombinant DNA techniques. Methods for loading antisense nucleic acid in liposomes are
known in the art as exemplified by U.S. Patent 4,241,046 issued December 23, 1980 to
Papahadjopoulos et al.
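
As an illustrative sketch only, an antisense sequence of the preferred length can be derived
from a sense (coding-strand) sequence by taking its reverse complement. The function name
and the choice of a DNA rather than RNA antisense strand are assumptions made for the
example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antisense_dna(sense, start=0, length=20):
    """Return the DNA reverse complement of sense[start:start+length].

    The result is written 5' to 3' and is complementary to the chosen stretch
    of the sense strand (and hence to the corresponding mRNA region).
    """
    target = sense[start:start + length]
    if len(target) < length:
        raise ValueError("requested region extends past the end of the sequence")
    return target.translate(_COMPLEMENT)[::-1]
```
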

The present invention encompasses isolated polypeptides and nucleic acids derived
from M. catarrhalis that are useful as reagents for diagnosis of bacterial infection,
components of effective anti-bacterial vaccines, and/or as targets for anti-bacterial drugs,
including anti-M. catarrhalis drugs.


EXPRESSION OF M. CATARRHALIS NUCLEIC ACIDS

Table 2, which is appended herewith and which forms part of the present
specification, provides a list of open reading frames (ORFs) in both strands and a putative
identification of the particular function of a polypeptide which is encoded by each ORF,
based on the homology match (determined by the BLASTP2 algorithm) of the predicted
polypeptide with known proteins encoded by ORFs in other organisms. An ORF is a region
of nucleic acid which encodes a polypeptide. This region may represent a portion of a
coding sequence or a total sequence and was determined from stop to stop codons. The first
column contains a designation for the ORF ("ORF Name"). The second and third columns
list the SEQ ID numbers for the nucleic acid ("NT ID") and amino acid ("AA ID")
sequences corresponding to each ORF, respectively. The fourth and fifth columns list the
length of the nucleic acid ORF ("NT Length") and the length of the amino acid ORF ("AA
Length"), respectively. The nucleotide sequence corresponding to each ORF begins at the
first nucleotide immediately following a stop codon and ends at the nucleotide immediately
preceding the next downstream stop codon in the same reading frame. It will be recognized
by one skilled in the art that the natural translation initiation sites will correspond to ATG,
GTG, or TTG codons located within the ORFs. The natural initiation sites depend not only
on the sequence of a start codon but also on the context of the DNA sequence adjacent to the
start codon. Usually, a recognizable ribosome binding site is found within 20 nucleotides
upstream from the initiation codon. In some cases where genes are translationally coupled
and coordinately expressed together in "operons", ribosome binding sites are not present, but
the initiation codon of a downstream gene may occur very close to, or overlap, the stop
codon of an upstream gene in the same operon. The correct start codons can generally be
identified without undue experimentation because only a few codons need be tested. It is
recognized that the translational machinery in bacteria initiates all polypeptide chains with
the amino acid methionine, regardless of the sequence of the start codon. In some cases,
polypeptides are post-translationally modified, resulting in an N-terminal amino acid other
than methionine in vivo. The sixth and seventh columns provide metrics for assessing the
likelihood of the homology match (determined by the BLASTP2 algorithm), as is known in
the art, to the genes indicated in the description frame ("Description") defined further below.
These genes in the Description were identified when the designated ORF was compared
against a comprehensive non-redundant protein database. Specifically, the sixth column
represents the Blast Score ("Score") for the match (a higher score is a better match), and the
seventh column represents the probability ("Probability") for the match (the probability that
such a match can have occurred by chance; the lower the value, the more likely the match is
valid). If a BLASTP2 score of less than 100 was obtained, no value is reported in the table.
The remaining fields below the columns contain additional information relating to the
potential function of the sequence based on the BLASTP2 analysis. Where a match was
discovered, the field "Protein name" lists the name of the protein identified from the match.
In addition, one skilled in the art would be able to identify the match and elucidate its
function using the "Locus name" and, where available, the accession number ("Acc#") from
the database. Lastly, one skilled in the art would appreciate that the "Description" field
further describes the potential function of the protein based on this analysis. This information
allows one of ordinary skill in the art to determine a potential use for each identified coding
sequence and, as a result, allows the polypeptides of the present invention to be used for
commercial and industrial purposes.
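
The stop-to-stop ORF definition and the candidate initiation codons described above can be
sketched, under stated assumptions, as follows. The function and field names are
hypothetical, only the three forward reading frames of the supplied strand are scanned
(the other strand would be handled by passing its reverse complement), and regions that run
off either end of the sequence without a bounding stop codon are ignored.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # candidate initiation codons named in the text

def stop_to_stop_orfs(seq, min_codons=1):
    """Find regions between consecutive in-frame stop codons on the given strand.

    `start` is the first nucleotide after a stop codon; `end` is the index of the
    first nucleotide of the next in-frame stop codon (0-based, end exclusive).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        orf_start = None  # set once the first stop codon in this frame is seen
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                if orf_start is not None and (i - orf_start) // 3 >= min_codons:
                    # In-frame ATG/GTG/TTG positions are only candidates; the
                    # natural site also depends on sequence context.
                    starts = [j for j in range(orf_start, i, 3)
                              if seq[j:j + 3] in STARTS]
                    orfs.append({"frame": frame, "start": orf_start,
                                 "end": i, "candidate_starts": starts})
                orf_start = i + 3
    return orfs
```

Each candidate start would then be weighed against the presence of a ribosome binding site
upstream, which this sketch does not attempt to detect.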

Using the information provided in SEQ ID NO: 1 - SEQ ID NO: 1920, SEQ ID NO:
1921 - SEQ ID NO: 3840 and in Table 2 together with routine cloning and sequencing
methods, one of ordinary skill in the art will be able to clone and sequence all the nucleic
acid fragments of interest including open reading frames (ORFs) encoding a large variety of
proteins of M. catarrhalis.

Nucleic acid isolated or synthesized in accordance with the sequences described
herein has utility to generate polypeptides. The nucleic acid of the invention exemplified in
SEQ ID NO: 1 - SEQ ID NO: 1920 and in Table 2 or fragments of said nucleic acid
encoding active portions of M. catarrhalis polypeptides can be cloned into suitable vectors
or used to isolate nucleic acid. The isolated nucleic acid is combined with suitable DNA
linkers and cloned into a suitable vector.

The function of a specific gene or operon can be ascertained by expression in a
bacterial strain under conditions where the activity of the gene product(s) specified by the
gene or operon in question can be specifically measured. Alternatively, a gene product may
be produced in large quantities in an expressing strain for use as an antigen, an industrial
reagent, for structural studies, etc. This expression can be accomplished in a mutant strain
which lacks the activity of the gene to be tested, or in a strain that does not produce the same
gene product(s). This includes, but is not limited to, eucaryotic species such as the yeast
Saccharomyces cerevisiae, Methanobacterium strains or other Archaea, and Eubacteria such
as E. coli, B. subtilis, S. aureus, S. pneumoniae or Pseudomonas putida. In some cases the
expression host will utilize the natural M. catarrhalis promoter whereas in others, it will be
necessary to drive the gene with a promoter sequence derived from the expressing organism
(e.g., an E. coli beta-galactosidase promoter for expression in E. coli).

To express a gene product using the natural M. catarrhalis promoter, a procedure
such as the following can be used. A restriction fragment containing the gene of interest,
together with its associated natural promoter element and regulatory sequences (identified
using the DNA sequence data), is cloned into an appropriate recombinant plasmid containing
an origin of replication that functions in the host organism and an appropriate selectable
marker. This can be accomplished by a number of procedures known to those skilled in the
art. It is most preferably done by cutting the plasmid and the fragment to be cloned with the
same restriction enzyme to produce compatible ends that can be ligated to join the two
pieces together. The recombinant plasmid is introduced into the host organism by, for
example, electroporation, and cells containing the recombinant plasmid are identified by
selection for the marker on the plasmid. Expression of the desired gene product is detected
using an assay specific for that gene product.

In the case of a gene that requires a different promoter, the body of the gene (coding
sequence) is specifically excised and cloned into an appropriate expression plasmid. This
subcloning can be done by several methods, but is most easily accomplished by PCR
amplification of a specific fragment and ligation into an expression plasmid after treating the
PCR product with a restriction enzyme or exonuclease to create suitable ends for cloning.

A suitable host cell for expression of a gene can be any procaryotic or eucaryotic cell.
Suitable methods for transforming host cells can be found in Sambrook et al. (Molecular
Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)),
and other laboratory textbooks.

For example, a host cell transfected with a nucleic acid vector directing expression of
a nucleotide sequence encoding an M. catarrhalis polypeptide can be cultured under
appropriate conditions to allow expression of the polypeptide to occur. Suitable media for
cell culture are well known in the art. Polypeptides of the invention can be isolated from cell
culture medium, host cells, or both using techniques known in the art for purifying proteins
including ion-exchange chromatography, gel filtration chromatography, ultrafiltration,
electrophoresis, and immunoaffinity purification with antibodies specific for such
polypeptides. Additionally, in many situations, polypeptides can be produced by chemical
cleavage of a native protein (e.g., tryptic digestion) and the cleavage products can then be
purified by standard techniques.

In the case of membrane bound proteins, these can be isolated from a host cell by
contacting a membrane-associated protein fraction with a detergent forming a solubilized
complex, where the membrane-associated protein is no longer entirely embedded in the
membrane fraction and is solubilized at least to an extent which allows it to be
chromatographically isolated from the membrane fraction. Chromatographic techniques
which can be used in the final purification step are known in the art and include hydrophobic
interaction, lectin affinity, ion exchange, dye affinity and immunoaffinity.

One strategy to maximize recombinant M. catarrhalis peptide expression in E. coli is
to express the protein in a host bacterium with an impaired capacity to proteolytically cleave
the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another strategy
would be to alter the nucleic acid encoding an M. catarrhalis peptide to be inserted into an
expression vector so that the individual codons for each amino acid would be those
preferentially utilized in highly expressed E. coli proteins (Wada et al., (1992) Nuc. Acids
Res. 20:2111-2118). Such alteration of nucleic acids of the invention can be carried out by
standard DNA synthesis techniques.

The nucleic acids of the invention can also be chemically synthesized using standard
techniques. Various methods of chemically synthesizing polydeoxynucleotides are known,
including solid-phase synthesis which, like peptide synthesis, has been fully automated in
commercially available DNA synthesizers (see, e.g., Itakura et al. U.S. Patent No.
4,598,049; Caruthers et al. U.S. Patent No. 4,458,066; and Itakura U.S. Patent Nos.
4,401,796 and 4,373,071, incorporated by reference herein).

The present invention provides a library of M. catarrhalis-derived nucleic acid
sequences. The libraries provide probes, primers, and markers which can be used as markers
in epidemiological studies. The present invention also provides a library of
M. catarrhalis-derived nucleic acid sequences which comprise or encode targets for
therapeutic drugs.

Nucleic acids comprising any of the sequences disclosed herein or sub-sequences
thereof can be prepared by standard methods using the nucleic acid sequence information
provided in SEQ ID NO: 1 - SEQ ID NO: 1920. For example, DNA can be chemically
synthesized using, e.g., the phosphoramidite solid support method of Matteucci et al., 1981,
J. Am. Chem. Soc. 103:3185, the method of Yoo et al., 1989, J. Biol. Chem. 264:17078, or
other well known methods. This can be done by sequentially linking a series of
oligonucleotide cassettes comprising pairs of synthetic oligonucleotides, as described below.

Of course, due to the degeneracy of the genetic code, many different nucleotide
sequences can encode polypeptides having the amino acid sequences defined by SEQ ID
NO: 1921 - SEQ ID NO: 3840 or sub-sequences thereof. The codons can be selected for
optimal expression in prokaryotic or eukaryotic systems. Such degenerate variants are also
encompassed by this invention.
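
To give a sense of scale for the degeneracy noted above, the number of distinct nucleotide
sequences encoding a given peptide is the product of the number of synonymous codons for
each residue. The sketch below assumes the standard genetic code; the table and function
names are chosen for the example only.

```python
# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}

def count_encodings(peptide):
    """Count the distinct coding sequences for `peptide` (stop codon not included)."""
    total = 1
    for residue in peptide.upper():
        total *= CODONS_PER_RESIDUE[residue]
    return total
```

For instance, the tripeptide Met-Ala-Leu admits 1 x 4 x 6 = 24 coding sequences, and the
count grows multiplicatively with peptide length.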

Insertion of nucleic acids (typically DNAs) encoding the polypeptides of the
invention into a vector is easily accomplished when the termini of both the DNAs and the
vector comprise compatible restriction sites. If this cannot be done, it may be necessary to
modify the termini of the DNAs and/or vector by digesting back single-stranded DNA
overhangs generated by restriction endonuclease cleavage to produce blunt ends, or to
achieve the same result by filling in the single-stranded termini with an appropriate DNA
polymerase.

Alternatively, any site desired may be produced, e.g., by ligating nucleotide
sequences (linkers) onto the termini. Such linkers may comprise specific oligonucleotide
sequences that define desired restriction sites. Restriction sites can also be generated by the
use of the polymerase chain reaction (PCR). See, e.g., Saiki et al., 1988, Science 239:48.
The cleaved vector and the DNA fragments may also be modified if required by
homopolymeric tailing.

The nucleic acids of the invention may be isolated directly from cells. Alternatively,
the polymerase chain reaction (PCR) method can be used to produce the nucleic acids of the
invention, using either chemically synthesized strands or genomic material as templates.
Primers used for PCR can be synthesized using the sequence information provided herein
and can further be designed to introduce appropriate new restriction sites, if desirable, to
facilitate incorporation into a given vector for recombinant expression.
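
A minimal sketch of such primer design is given below, assuming three commonly used
enzymes and their recognition sequences. The function names, the short 5' "clamp" of extra
bases, and the choice of enzymes are assumptions for illustration and are not prescribed by
the present disclosure.

```python
# Recognition sequences of three widely used restriction enzymes (all palindromic).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def tailed_primer(annealing, enzyme, clamp="GC"):
    """Prefix a template-annealing sequence with a restriction site and a 5' clamp.

    The clamp (extra bases 5' of the site) is an assumed convenience to aid
    cleavage near the end of a PCR product.
    """
    return clamp + SITES[enzyme] + annealing.upper()

def has_internal_site(insert, enzyme):
    """Report whether the insert already contains the site (it would then be cut)."""
    # The sites are palindromic, so scanning one strand is sufficient.
    return SITES[enzyme] in insert.upper()
```

An enzyme whose site already occurs within the fragment to be cloned would ordinarily be
avoided, which is what the second function checks.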

The nucleic acids of the present invention may be flanked by natural M. catarrhalis
regulatory sequences, or may be associated with heterologous sequences, including
promoters, enhancers, response elements, signal sequences, polyadenylation sequences,
introns, 5'- and 3'-noncoding regions, and the like. The nucleic acids may also be modified
by many means known in the art. Non-limiting examples of such modifications include
methylation, "caps", substitution of one or more of the naturally occurring nucleotides with
an analog, internucleotide modifications such as, for example, those with uncharged linkages
(e.g., methyl phosphonates, phosphotriesters, phosphoroamidates, carbamates, etc.) and with
charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Nucleic acids may
contain one or more additional covalently linked moieties, such as, for example, proteins
(e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g.,
acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals,
etc.), and alkylators. PNAs are also included. The nucleic acid may be derivatized by
formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage.
Furthermore, the nucleic acid sequences of the present invention may also be modified with
a label capable of providing a detectable signal, either directly or indirectly. Exemplary
labels include radioisotopes, fluorescent molecules, biotin, and the like.

Applicant's Docket No.: PATH03-14

The invention also provides nucleic acid vectors comprising the disclosed M.
catarrhalis-derived sequences or derivatives or fragments thereof. A large number of
vectors, including plasmid and bacterial vectors, have been described for replication and/or
expression in a variety of eukaryotic and prokaryotic hosts, and may be used for cloning or
protein expression.

The encoded M. catarrhalis polypeptides may be expressed by using many known
vectors, such as pUC plasmids, pET plasmids (Novagen, Inc., Madison, WI), or pRSET or
pREP (Invitrogen, San Diego, CA), and many appropriate host cells, using methods
disclosed or cited herein or otherwise known to those skilled in the relevant art. The
particular choice of vector/host is not critical to the practice of the invention.

Recombinant cloning vectors will often include one or more replication systems for
cloning or expression, one or more markers for selection in the host, e.g. antibiotic
resistance, and one or more expression cassettes. The inserted M. catarrhalis coding
sequences may be synthesized by standard methods, isolated from natural sources, or
prepared as hybrids, etc. Ligation of the M. catarrhalis coding sequences to transcriptional
regulatory elements and/or to other amino acid coding sequences may be achieved by known
methods. Suitable host cells may be transformed/transfected/infected as appropriate by any
suitable method including electroporation, CaCl2-mediated DNA uptake, bacterial infection,
microinjection, microprojectile, or other established methods.

Appropriate host cells include bacteria, archebacteria, fungi, especially yeast, and
plant and animal cells, especially mammalian cells. Of particular interest are M. catarrhalis,
E. coli, B. subtilis, Saccharomyces cerevisiae, Saccharomyces carlsbergensis,
Schizosaccharomyces pombe, SF9 cells, C129 cells, 293 cells, Neurospora, and CHO cells,
COS cells, HeLa cells, and immortalized mammalian myeloid and lymphoid cell lines.
Preferred replication systems include M13, ColE1, SV40, baculovirus, lambda, adenovirus,
and the like. A large number of transcription initiation and termination regulatory regions
have been isolated and shown to be effective in the transcription and translation of
heterologous proteins in the various hosts. Examples of these regions, methods of isolation,
manner of manipulation, etc. are known in the art. Under appropriate expression conditions,
host cells can be used as a source of recombinantly produced M. catarrhalis-derived
peptides and polypeptides.

Advantageously, vectors may also include a transcription regulatory element (i.e., a
promoter) operably linked to the M. catarrhalis portion. The promoter may optionally
contain operator portions and/or ribosome binding sites. Non-limiting examples of bacterial
promoters compatible with E. coli include: beta-lactamase (penicillinase) promoter; lactose
promoter; tryptophan (trp) promoter; araBAD (arabinose) operon promoter; lambda-derived
PL promoter and N gene ribosome binding site; and the hybrid tac promoter derived from
sequences of the trp and lac UV5 promoters. Non-limiting examples of yeast promoters
include 3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate dehydrogenase
(GAPDH) promoter, galactokinase (GAL1) promoter, galactoepimerase promoter, and
alcohol dehydrogenase (ADH) promoter. Suitable promoters for mammalian cells include
without limitation viral promoters such as that from Simian Virus 40 (SV40), Rous sarcoma
virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may
also require terminator sequences, polyA addition sequences and enhancer sequences to
increase expression. Sequences which cause amplification of the gene may also be desirable.
Furthermore, sequences that facilitate secretion of the recombinant product from cells,
including, but not limited to, bacteria, yeast, and animal cells, such as secretory signal
sequences and/or prohormone pro region sequences, may also be included. These sequences
are well described in the art.

Nucleic acids encoding wild-type or variant M. catarrhalis-derived polypeptides
may also be introduced into cells by recombination events. For example, such a sequence
can be introduced into a cell, and thereby effect homologous recombination at the site of an
endogenous gene or a sequence with substantial identity to the gene. Other
recombination-based methods such as nonhomologous recombinations or deletion of
endogenous genes by homologous recombination may also be used.

The nucleic acids of the present invention find use as templates for the recombinant
production of M. catarrhalis-derived peptides or polypeptides.

IDENTIFICATION AND USE OF M. CATARRHALIS NUCLEIC ACID SEQUENCES

The disclosed M. catarrhalis polypeptide and nucleic acid sequences, or other
sequences that are contained within ORFs, including complete protein-coding sequences, of
which any of the disclosed M. catarrhalis-specific sequences forms a part, are useful as
target components for diagnosis and/or treatment of M. catarrhalis-caused infection.

It will be understood that the sequence of an entire protein-coding sequence of which
each disclosed nucleic acid sequence forms a part can be isolated and identified based on
each disclosed sequence. This can be achieved, for example, by using an isolated nucleic
acid encoding the disclosed sequence, or fragments thereof, to prime a sequencing reaction
with genomic M. catarrhalis DNA as template; this is followed by sequencing the amplified
product. The isolated nucleic acid encoding the disclosed sequence, or fragments thereof,
can also be hybridized to M. catarrhalis genomic libraries to identify clones containing
additional complete segments of the protein-coding sequence of which the shorter sequence
forms a part. Then, the entire protein-coding sequence, or fragments thereof, or nucleic
acids encoding all or part of the sequence, or sequence-conservative or function-conservative
variants thereof, may be employed in practicing the present invention.

Preferred sequences are those that are useful in diagnostic and/or therapeutic
applications. Diagnostic applications include without limitation nucleic-acid-based and
antibody-based methods for detecting bacterial infection. Therapeutic applications include
without limitation vaccines, passive immunotherapy, and drug treatments directed against
gene products that are both unique to bacteria and essential for growth and/or replication of
bacteria.



IDENTIFICATION OF NUCLEIC ACIDS ENCODING VACCINE COMPONENTS AND
TARGETS FOR AGENTS EFFECTIVE AGAINST M. CATARRHALIS

The disclosed M. catarrhalis genome sequence includes segments that direct the
synthesis of ribonucleic acids and polypeptides, as well as origins of replication, promoters,
other types of regulatory sequences, and intergenic nucleic acids. The invention
encompasses nucleic acids encoding immunogenic components of vaccines and targets for
agents effective against M. catarrhalis. Identification of said immunogenic components
involves the determination of the function of the disclosed sequences, which can be
achieved using a variety of approaches. Non-limiting examples of these approaches are
described briefly below.

HOMOLOGY TO KNOWN SEQUENCES:

Computer-assisted comparison of the disclosed M. catarrhalis sequences with
previously reported sequences present in publicly available databases is useful for identifying
functional M. catarrhalis nucleic acid and polypeptide sequences. It will be understood that
protein-coding sequences, for example, may be compared as a whole, and that a high degree
of sequence homology between two proteins (such as, for example, >80-90%) at the amino
acid level indicates that the two proteins also possess some degree of functional homology,
such as, for example, among enzymes involved in metabolism, DNA synthesis, or cell wall
synthesis, and proteins involved in transport, cell division, etc. In addition, many structural
features of particular protein classes have been identified and correlate with specific
consensus sequences, such as, for example, binding domains for nucleotides, DNA, metal
ions, and other small molecules; sites for covalent modifications such as phosphorylation,
acylation, and the like; sites of protein:protein interactions, etc. These consensus sequences
may be quite short and thus may represent only a fraction of the entire protein-coding
sequence. Identification of such a feature in an M. catarrhalis sequence is therefore useful in
determining the function of the encoded protein and identifying useful targets of antibacterial
drugs.
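
For illustration, the degree of similarity between two already-aligned protein sequences can
be expressed as percent identity, as in the sketch below. Using identity over gap-free
aligned positions as a stand-in for "sequence homology" is an assumption of this example,
as are the function name and the gap character; other denominators are also in common use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned positions where neither has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free aligned positions to compare")
    return 100.0 * matches / compared
```

The alignment itself is assumed to have been produced beforehand by a suitable alignment
program.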

Of particular relevance to the present invention are structural features that are
common to secretory, transmembrane, and surface proteins, including secretion signal
peptides and hydrophobic transmembrane domains. M. catarrhalis proteins identified as
containing putative signal sequences and/or transmembrane domains are useful as
immunogenic components of vaccines.

Targets for therapeutic drugs according to the invention include, but are not limited
to, polypeptides of the invention, whether unique to M. catarrhalis or not, that are essential
for growth and/or viability of M. catarrhalis under at least one growth condition.
Polypeptides essential for growth and/or viability can be determined by examining the effect
of deleting and/or disrupting the genes, i.e., by so-called gene "knockout". Alternatively,
genetic footprinting can be used (Smith et al., 1995, Proc. Natl. Acad. Sci. USA 92:6479-6483;
Published International Application WO 94/26933; U.S. Patent No. 5,612,180). Still
other methods for assessing essentiality include the ability to isolate conditional lethal
mutations in the specific gene (e.g., temperature-sensitive mutations). Other useful targets
for therapeutic drugs, which include polypeptides that are not essential for growth or
viability per se but lead to loss of viability of the cell, can be used to target therapeutic
agents to cells.

STRAIN-SPECIFIC SEQUENCES:

Because of the evolutionary relationship between different M. catarrhalis strains, it is
believed that the presently disclosed M. catarrhalis sequences are useful for identifying,
and/or discriminating between, previously known and new M. catarrhalis strains. It is
believed that other M. catarrhalis strains will exhibit at least about 70% sequence homology
with the presently disclosed sequence. Systematic and routine analyses of DNA sequences
derived from samples containing M. catarrhalis strains, and comparison with the present
sequence, allow for the identification of sequences that can be used to discriminate between
strains, as well as those that are common to all M. catarrhalis strains. In one embodiment,
the invention provides nucleic acids, including probes, and peptide and polypeptide
sequences that discriminate between different strains of M. catarrhalis. Strain-specific
components can also be identified functionally by their ability to elicit or react with
antibodies that selectively recognize one or more M. catarrhalis strains.

In another embodiment, the invention provides nucleic acids, including probes, and
peptide and polypeptide sequences that are common to all M. catarrhalis strains but are not
found in other bacterial species.

M. CATARRHALIS POLYPEPTIDES

This invention encompasses isolated M. catarrhalis polypeptides encoded by the
disclosed M. catarrhalis genomic sequences, including the polypeptides of the invention
contained in the Sequence Listing. Polypeptides of the invention are preferably at least
about 5 amino acid residues in length. Using the DNA sequence information provided
herein, the amino acid sequences of the polypeptides encompassed by the invention can be
deduced using methods well-known in the art. It will be understood that the sequence of an
entire nucleic acid encoding an M. catarrhalis polypeptide can be isolated and identified
based on an ORF that encodes only a fragment of the cognate protein-coding region. This
can be achieved, for example, by using the isolated nucleic acid encoding the ORF, or
fragments thereof, to prime a polymerase chain reaction with genomic M. catarrhalis DNA
as template; this is followed by sequencing the amplified product.
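
As a minimal sketch of how an amino acid sequence may be deduced from a DNA sequence, the following translates a coding-strand sequence in its first reading frame using the standard codon assignments. The function name translate and the example input are hypothetical; frame selection, start-codon choice, and reverse-strand ORFs are deliberately left out.

```python
from itertools import product

# Standard codon assignments, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a coding-strand DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks codons with ambiguous bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical input used only for illustration.
print(translate("ATGGCCAAATAA"))
```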

The polypeptides of the present invention, including function-conservative variants
of the disclosed ORFs, may be isolated from wild-type or mutant M. catarrhalis cells, or
from heterologous organisms or cells (including, but not limited to, bacteria, fungi, insect,
plant, and mammalian cells), including M. catarrhalis, into which an M. catarrhalis-derived
protein-coding sequence has been introduced and expressed. Furthermore, the polypeptides
may be part of recombinant fusion proteins.

M. catarrhalis polypeptides of the invention can be chemically synthesized using
commercially available automated procedures such as those referenced herein, including, without
limitation, exclusive solid phase synthesis, partial solid phase methods, fragment
condensation or classical solution synthesis. The polypeptides are preferably prepared by
solid phase peptide synthesis as described by Merrifield, 1963, J. Am. Chem. Soc. 85:2149.
The synthesis is carried out with amino acids that are protected at the alpha-amino terminus.
Trifunctional amino acids with labile side-chains are also protected with suitable groups to
prevent undesired chemical reactions from occurring during the assembly of the
polypeptides. The alpha-amino protecting group is selectively removed to allow subsequent
reaction to take place at the amino-terminus. The conditions for the removal of the
alpha-amino protecting group do not remove the side-chain protecting groups.

Methods for polypeptide purification are well-known in the art, including, without
limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase
HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent
distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant
system in which the M. catarrhalis protein contains an additional sequence tag that
facilitates purification, such as, but not limited to, a polyhistidine sequence. The polypeptide
can then be purified from a crude lysate of the host cell by chromatography on an appropriate
solid-phase matrix. Alternatively, antibodies produced against an M. catarrhalis protein or
against peptides derived therefrom can be used as purification reagents. Other purification
methods are possible.

The present invention also encompasses derivatives and homologues of M.
catarrhalis-encoded polypeptides. For some purposes, nucleic acid sequences encoding the
peptides may be altered by substitutions, additions, or deletions that provide for functionally
equivalent molecules, i.e., function-conservative variants. For example, one or more amino
acid residues within the sequence can be substituted by another amino acid of similar
properties, such as, for example, positively charged amino acids (arginine, lysine, and
histidine); negatively charged amino acids (aspartate and glutamate); polar neutral amino
acids; and non-polar amino acids.
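
The notion of substitution within a property class can be sketched as a simple lookup. The positively and negatively charged classes below follow the text; the membership of the polar neutral and non-polar classes is an assumption made for illustration, as are the names residue_class and is_conservative.

```python
# Residue classes; the last two memberships are assumed, not taken from the text.
CLASSES = {
    "positive": set("RKH"),            # arginine, lysine, histidine
    "negative": set("DE"),             # aspartate, glutamate
    "polar_neutral": set("STNQCYG"),   # assumed membership
    "nonpolar": set("AVLIMFWP"),       # assumed membership
}

def residue_class(aa):
    """Return the name of the property class containing a one-letter residue code."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown residue: {aa!r}")

def is_conservative(original, replacement):
    """True when both residues fall in the same property class."""
    return residue_class(original) == residue_class(replacement)

print(is_conservative("K", "R"), is_conservative("D", "K"))
```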

The isolated polypeptides may be modified by, for example, phosphorylation,
sulfation, acylation, or other protein modifications. They may also be modified with a label
capable of providing a detectable signal, either directly or indirectly, including, but not
limited to, radioisotopes and fluorescent compounds.

To identify M. catarrhalis-derived polypeptides for use in the present invention,
essentially the complete genomic sequence of a virulent, methicillin-resistant isolate of M.
catarrhalis was analyzed. While, in very rare instances, a nucleic acid sequencing
error may be revealed, resolving a rare sequencing error is well within the art, and such an
occurrence will not prevent one skilled in the art from practicing the invention.

Also encompassed are any M. catarrhalis polypeptide sequences that are contained
within the open reading frames (ORFs), including complete protein-coding sequences, of
which any of SEQ ID NO: 1 - SEQ ID NO: 1920 forms a part. Table 2, which is appended
herewith and which forms part of the present specification, provides a putative identification
of the particular function of a polypeptide which is encoded by each ORF, based on the
homology match (determined by the BLAST algorithm) of the predicted polypeptide with
known proteins encoded by ORFs in other organisms. As a result, one skilled in the art can
use the polypeptides of the present invention for commercial and industrial purposes
consistent with the type of putative identification of the polypeptide.

The present invention provides a library of M. catarrhalis-derived polypeptide
sequences, and a corresponding library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained within ORFs of which they
form a part, comprise sequences that are contemplated for use as components of vaccines.
Non-limiting examples of such sequences are listed by SEQ ID NO in Table 2, which is
appended herewith and which forms part of the present specification.

The present invention also provides a library of M. catarrhalis-derived polypeptide
sequences, and a corresponding library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained within ORFs of which they
form a part, comprise sequences lacking homology to any known prokaryotic or eukaryotic
sequences. Such libraries provide probes, primers, and markers which can be used to
diagnose M. catarrhalis infection, including use as markers in epidemiological studies.
Non-limiting examples of such sequences are listed by SEQ ID NO in Table 2, which is
appended hereto and part hereof.

The present invention also provides a library of M. catarrhalis-derived polypeptide
sequences, and a corresponding library of nucleic acid sequences encoding the polypeptides,
wherein the polypeptides themselves, or polypeptides contained within ORFs of which they
form a part, comprise targets for therapeutic drugs.

SPECIFIC EXAMPLE: DETERMINATION OF MORAXELLA PROTEIN ANTIGENS 
FOR ANTIBODY AND VACCINE DEVELOPMENT 

The selection of Moraxella protein antigens for vaccine development can be derived
from the nucleic acids encoding M. catarrhalis polypeptides. First, the ORFs can be
analyzed for homology to other known exported or membrane proteins and analyzed using
the discriminant analysis described by Klein, et al. (Klein, P., Kanehisa, M., and DeLisi, C.
(1985) Biochimica et Biophysica Acta 815, 468-476) for predicting exported and membrane
proteins.

Homology searches can be performed using the BLAST algorithm contained in the
Wisconsin Sequence Analysis Package (Genetics Computer Group, University Research
Park, 575 Science Drive, Madison, WI 53711) to compare each predicted ORF amino acid
sequence with all sequences found in the current GenBank, SWISS-PROT and PIR
databases. BLAST searches for local alignments between the ORF and the databank
sequences and reports a probability score which indicates the probability of finding this
sequence by chance in the database. ORFs with significant homology (e.g., probabilities
lower than 1 x 10^-6 that the homology is only due to random chance) to membrane or
exported proteins represent protein antigens for vaccine development. Possible functions
can be provided to M. catarrhalis genes based on sequence homology to genes cloned in
other organisms.
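
The probability cutoff described above amounts to a simple filter over search results. The sketch below assumes each hit has already been parsed into a (query ORF, matched protein description, probability) tuple; the tuple layout, the function name significant_hits, and the example records are hypothetical and do not reproduce any actual search output.

```python
def significant_hits(hits, threshold=1e-6):
    """Keep hits whose chance probability is below the threshold, most significant first."""
    kept = [hit for hit in hits if hit[2] < threshold]
    return sorted(kept, key=lambda hit: hit[2])

# Hypothetical parsed search results: (ORF, description of matched protein, probability).
hits = [
    ("ORF0001", "outer membrane protein", 3e-42),
    ("ORF0002", "hypothetical protein", 0.02),
    ("ORF0003", "ABC transporter", 5e-9),
]

for orf, description, probability in significant_hits(hits):
    print(orf, description, probability)
```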

Discriminant analysis (Klein, et al. supra) can be used to examine the ORF amino
acid sequences. This algorithm uses the intrinsic information contained in the ORF amino
acid sequence and compares it to information derived from the properties of known
membrane and exported proteins. This comparison predicts which proteins will be exported,
membrane associated or cytoplasmic. ORF amino acid sequences identified as exported or
membrane associated by this algorithm are likely protein antigens for vaccine development.
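
The following is not the discriminant function of Klein et al.; it is a simpler stand-in, offered only to show the general idea of predicting membrane association from intrinsic sequence properties. It averages Kyte-Doolittle hydropathy values over a sliding window and flags sequences containing a strongly hydrophobic stretch. The window of 19 residues, the cutoff of 1.6, and the function names are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard residues.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_windows(protein, window=19):
    """Mean hydropathy of every full-length window; unknown residues count as 0."""
    return [
        sum(KD.get(aa, 0.0) for aa in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]

def has_putative_tm_segment(protein, window=19, cutoff=1.6):
    """True when at least one window reaches the assumed hydrophobicity cutoff."""
    return any(score >= cutoff for score in hydropathy_windows(protein, window))

# Hypothetical sequences used only for illustration.
print(has_putative_tm_segment("MK" + "LIVALLVAILLAVLIAVLL" + "RKDE"))
print(has_putative_tm_segment("DEKRDEKRDEKRDEKRDEKRDEKR"))
```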

PRODUCTION OF FRAGMENTS AND ANALOGS OF M. CATARRHALIS NUCLEIC
ACIDS AND POLYPEPTIDES

Based on the discovery of the M. catarrhalis gene products of the invention provided
in the Sequence Listing, one skilled in the art can alter the disclosed structure of M.
catarrhalis genes, e.g., by producing fragments or analogs, and test the newly produced
structures for activity. Examples of techniques known to those skilled in the relevant art
which allow the production and testing of fragments and analogs are discussed below.
These, or analogous methods, can be used to make and screen libraries of polypeptides, e.g.,
libraries of random peptides or libraries of fragments or analogs of cellular proteins, for the
ability to bind M. catarrhalis polypeptides. Such screens are useful for the identification of
inhibitors of M. catarrhalis.

GENERATION OF FRAGMENTS

Fragments of a protein can be produced in several ways, e.g., recombinantly, by
proteolytic digestion, or by chemical synthesis. Internal or terminal fragments of a
polypeptide can be generated by removing one or more nucleotides from one end (for a
terminal fragment) or both ends (for an internal fragment) of a nucleic acid which encodes
the polypeptide. Expression of the mutagenized DNA produces polypeptide fragments.
Digestion with "end-nibbling" endonucleases can thus generate DNAs which encode an array
of fragments. DNAs which encode fragments of a protein can also be generated by random
shearing, restriction digestion or a combination of the above-discussed methods.

Fragments can also be chemically synthesized using techniques known in the art such
as conventional Merrifield solid phase Fmoc or t-Boc chemistry. For example, peptides of
the present invention may be arbitrarily divided into fragments of desired length with no
overlap of the fragments, or divided into overlapping fragments of a desired length.
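
The division of a peptide into non-overlapping or overlapping fragments can be sketched as follows. The function name fragments and the example peptide are hypothetical; in this sketch, trailing residues that do not fill a complete fragment are simply dropped.

```python
def fragments(peptide, length, overlap=0):
    """Split a peptide into fragments of a fixed length, sharing `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be at least 0 and smaller than length")
    step = length - overlap
    return [peptide[i:i + length]
            for i in range(0, len(peptide) - length + 1, step)]

# Hypothetical peptide used only for illustration.
print(fragments("ABCDEFGHIJ", 5))             # no overlap
print(fragments("ABCDEFGHIJ", 4, overlap=2))  # neighbours share two residues
```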


ALTERATION OF NUCLEIC ACIDS AND POLYPEPTIDES: RANDOM METHODS 

Amino acid sequence variants of a protein can be prepared by random mutagenesis of
DNA which encodes a protein or a particular domain or region of a protein. Useful methods
include PCR mutagenesis and saturation mutagenesis. A library of random amino acid
sequence variants can also be generated by the synthesis of a set of degenerate
oligonucleotide sequences. (Methods for screening proteins in a library of variants are
described elsewhere herein.)

PCR MUTAGENESIS 

In PCR mutagenesis, reduced Taq polymerase fidelity is used to introduce random
mutations into a cloned fragment of DNA (Leung et al., 1989, Technique 1:11-15). The
DNA region to be mutagenized is amplified using the polymerase chain reaction (PCR)
under conditions that reduce the fidelity of DNA synthesis by Taq DNA polymerase, e.g., by
using a dGTP/dATP ratio of five and adding Mn2+ to the PCR reaction. The pool of
amplified DNA fragments is inserted into appropriate cloning vectors to provide random
mutant libraries.

SATURATION MUTAGENESIS 
Saturation mutagenesis allows for the rapid introduction of a large number of single
base substitutions into cloned DNA fragments (Myers et al., 1985, Science 229:242). This
technique includes generation of mutations, e.g., by chemical treatment or irradiation of
single-stranded DNA in vitro, and synthesis of a complementary DNA strand. The mutation
frequency can be modulated by modulating the severity of the treatment, and essentially all
possible base substitutions can be obtained. Because this procedure does not involve a
genetic selection for mutant fragments, both neutral substitutions and those that alter
function are obtained. The distribution of point mutations is not biased toward conserved
sequence elements.
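
To give a sense of the sequence space that "essentially all possible base substitutions" covers, the sketch below enumerates every single-base substitution of a short DNA fragment in silico; a fragment of n bases has 3n such variants. This models only the outcome, not the chemical treatment, and the function name and example input are hypothetical.

```python
def single_base_substitutions(dna):
    """Return every variant of `dna` that differs from it at exactly one position."""
    dna = dna.upper()
    variants = []
    for i, base in enumerate(dna):
        for alt in "ACGT":
            if alt != base:
                variants.append(dna[:i] + alt + dna[i + 1:])
    return variants

# Hypothetical fragment used only for illustration: 3 bases give 9 variants.
print(single_base_substitutions("ATG"))
```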

DEGENERATE OLIGONUCLEOTIDES

A library of homologs can also be generated from a set of degenerate oligonucleotide
sequences. Chemical synthesis of a degenerate sequence can be carried out in an automatic
DNA synthesizer, and the synthetic genes then ligated into an appropriate expression vector.
The synthesis of degenerate oligonucleotides is known in the art (see, for example, Narang,
SA (1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc. 3rd Cleveland
Sympos. Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp. 273-289; Itakura et al.
(1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983)
Nucleic Acid Res. 11:477). Such techniques have been employed in the directed evolution of
other proteins (see, for example, Scott et al. (1990) Science 249:386-390; Roberts et al.
(1992) PNAS 89:2429-2433; Devlin et al. (1990) Science 249:404-406; Cwirla et al. (1990)
PNAS 87:6378-6382; as well as U.S. Patents Nos. 5,223,409, 5,198,346, and 5,096,815).
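
A degenerate oligonucleotide written with IUPAC ambiguity codes stands for a defined set of concrete sequences. As an illustrative sketch, the function below (a hypothetical name, expand) lists that set; the NNK codon in the example is a commonly used degenerate codon and is not taken from the present specification.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in oligo.upper()))]

print(expand("AR"))        # two sequences
print(len(expand("NNK")))  # 4 * 4 * 2 sequences
```

The number of encoded sequences grows multiplicatively with each degenerate position, which is worth checking before a library is synthesized.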



ALTERATION OF NUCLEIC ACIDS AND POLYPEPTIDES: METHODS FOR
DIRECTED MUTAGENESIS

Non-random, or directed, mutagenesis techniques can be used to provide specific
sequences or mutations in specific regions. These techniques can be used to create variants
which include, e.g., deletions, insertions, or substitutions of residues of the known amino
acid sequence of a protein. The sites for mutation can be modified individually or in series,
e.g., by (1) substituting first with conserved amino acids and then with more radical choices
depending upon results achieved, (2) deleting the target residue, or (3) inserting residues of
the same or a different class adjacent to the located site, or combinations of options 1-3.


ALANINE SCANNING MUTAGENESIS 

Alanine scanning mutagenesis is a useful method for identification of certain residues
or regions of the desired protein that are preferred locations or domains for mutagenesis
(Cunningham and Wells, Science 244:1081-1085, 1989). In alanine scanning, a residue or
group of target residues is identified (e.g., charged residues such as Arg, Asp, His, Lys, and
Glu) and replaced by a neutral or negatively charged amino acid (most preferably alanine or
polyalanine). Replacement of an amino acid can affect the interaction of the amino acids
with the surrounding aqueous environment in or outside the cell. Those domains
demonstrating functional sensitivity to the substitutions are then refined by introducing
further or other variants at or for the sites of substitution. Thus, while the site for
introducing an amino acid sequence variation is predetermined, the nature of the mutation
per se need not be predetermined. For example, to optimize the performance of a mutation
at a given site, alanine scanning or random mutagenesis may be conducted at the target
codon or region and the expressed desired protein subunit variants are screened for the
optimal combination of desired activity.
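
The design step of an alanine scan can be sketched in a few lines: each targeted residue is replaced by alanine, one position at a time, giving one variant per target. The default target set follows the charged residues named above; the function name alanine_scan, the variant naming scheme, and the example sequence are hypothetical.

```python
def alanine_scan(protein, targets="RDHKE"):
    """Map variant names such as 'K2A' to sequences with that one residue set to Ala."""
    variants = {}
    for i, aa in enumerate(protein):
        if aa in targets:
            name = f"{aa}{i + 1}A"  # original residue, 1-based position, new residue
            variants[name] = protein[:i] + "A" + protein[i + 1:]
    return variants

# Hypothetical sequence used only for illustration.
for name, sequence in alanine_scan("MKDL").items():
    print(name, sequence)
```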

OLIGONUCLEOTIDE-MEDIATED MUTAGENESIS 

Oligonucleotide-mediated mutagenesis is a useful method for preparing substitution,
deletion, and insertion variants of DNA, see, e.g., Adelman et al. (DNA 2:183, 1983).
Briefly, the desired DNA is altered by hybridizing an oligonucleotide encoding a mutation to
a DNA template, where the template is the single-stranded form of a plasmid or
bacteriophage containing the unaltered or native DNA sequence of the desired protein. After
hybridization, a DNA polymerase is used to synthesize an entire second complementary
strand of the template that will thus incorporate the oligonucleotide primer, and will code for
the selected alteration in the desired protein DNA. Generally, oligonucleotides of at least
about 25 nucleotides in length are used. An optimal oligonucleotide will have 12 to 15
nucleotides that are completely complementary to the template on either side of the
nucleotide(s) coding for the mutation. This ensures that the oligonucleotide will hybridize
properly to the single-stranded DNA template molecule. The oligonucleotides are readily
synthesized using techniques known in the art such as that described by Crea et al. (Proc.
Natl. Acad. Sci. USA 75:5765 [1978]).
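
The oligonucleotide layout described above (a mutated stretch flanked on each side by 12 to 15 matching nucleotides) can be sketched as follows. The function name mutagenic_oligo, its arguments, and the example sequence are hypothetical. The oligonucleotide is returned in the same sense as the input sequence; whether it or its reverse complement is synthesized depends on which strand the single-stranded template carries, which this sketch does not model.

```python
def mutagenic_oligo(sequence, start, end, replacement, flank=15):
    """Build an oligo that swaps sequence[start:end] for `replacement`.

    `start` and `end` are 0-based, half-open coordinates; `flank` unchanged
    nucleotides are kept on each side of the altered stretch.
    """
    if start < flank or end + flank > len(sequence):
        raise ValueError("not enough flanking sequence on one side")
    return sequence[start - flank:start] + replacement + sequence[end:end + flank]

# Hypothetical sequence: change the codon GAT to GCT with 15-nucleotide flanks.
sequence = "A" * 15 + "GAT" + "C" * 15
oligo = mutagenic_oligo(sequence, 15, 18, "GCT")
print(oligo, len(oligo))
```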

CASSETTE MUTAGENESIS 

Another method for preparing variants, cassette mutagenesis, is based on the
technique described by Wells et al. (Gene 34:315 [1985]). The starting material is a plasmid
(or other vector) which includes the protein subunit DNA to be mutated. The codon(s) in the
protein subunit DNA to be mutated are identified. There must be a unique restriction
endonuclease site on each side of the identified mutation site(s). If no such restriction sites
exist, they may be generated using the above-described oligonucleotide-mediated
mutagenesis method to introduce them at appropriate locations in the desired protein subunit
DNA. After the restriction sites have been introduced into the plasmid, the plasmid is cut at
these sites to linearize it. A double-stranded oligonucleotide encoding the sequence of the
DNA between the restriction sites but containing the desired mutation(s) is synthesized using
standard procedures. The two strands are synthesized separately and then hybridized
together using standard techniques. This double-stranded oligonucleotide is referred to as
the cassette. This cassette is designed to have 3' and 5' ends that are compatible with the
ends of the linearized plasmid, such that it can be directly ligated to the plasmid. This
plasmid now contains the mutated desired protein subunit DNA sequence.

COMBINATORIAL MUTAGENESIS

Combinatorial mutagenesis can also be used to generate mutants (Ladner et al., WO
88/06630). In this method, the amino acid sequences for a group of homologs or other
related proteins are aligned, preferably to promote the highest homology possible. All of the
amino acids which appear at a given position of the aligned sequences can be selected to
create a degenerate set of combinatorial sequences. The variegated library of variants is
generated by combinatorial mutagenesis at the nucleic acid level, and is encoded by a
variegated gene library. For example, a mixture of synthetic oligonucleotides can be
enzymatically ligated into gene sequences such that the degenerate set of potential sequences
is expressible as individual peptides, or alternatively, as a set of larger fusion proteins
containing the set of degenerate sequences.
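
At the amino acid level, the degenerate set described above is the product of the residues observed at each aligned position. The sketch below assumes an ungapped alignment of equal-length sequences; the function name combinatorial_library and the three-residue example are hypothetical.

```python
from itertools import product

def combinatorial_library(aligned):
    """Combine every residue seen at each aligned position into all possible sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # For each alignment column, collect the distinct residues in a stable order.
    choices = [sorted(set(column)) for column in zip(*aligned)]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical alignment: positions 1 and 3 each vary between two residues.
print(combinatorial_library(["ACD", "ACE", "GCD"]))
```

Note that the library can contain combinations, such as the last one printed here, that occur in none of the aligned homologs.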

OTHER MODIFICATIONS OF M. CATARRHALIS NUCLEIC ACIDS AND
POLYPEPTIDES

It is possible to modify the structure of an M. catarrhalis polypeptide for such
purposes as increasing solubility or enhancing stability (e.g., shelf life ex vivo and resistance to
proteolytic degradation in vivo). A modified M. catarrhalis protein or peptide can be
produced in which the amino acid sequence has been altered, such as by amino acid
substitution, deletion, or addition as described herein.

An M. catarrhalis peptide can also be modified by substitution of cysteine residues,
preferably with alanine, serine, threonine, leucine or glutamic acid residues, to minimize
dimerization via disulfide linkages. In addition, amino acid side chains of fragments of the
protein of the invention can be chemically modified. Another modification is cyclization of
the peptide.

In order to enhance stability and/or reactivity, an M. catarrhalis polypeptide can be
modified to incorporate one or more polymorphisms in the amino acid sequence of the
protein resulting from any natural allelic variation. Additionally, D-amino acids, non-natural
amino acids, or non-amino acid analogs can be substituted or added to produce a modified
protein within the scope of this invention. Furthermore, an M. catarrhalis polypeptide can
be modified using polyethylene glycol (PEG) according to the method of A. Sehon and
co-workers (Wie et al., supra) to produce a protein conjugated with PEG. In addition, PEG can
be added during chemical synthesis of the protein. Other modifications of M. catarrhalis
proteins include reduction/alkylation (Tarr, Methods of Protein Microcharacterization, J. E.
Silver ed., Humana Press, Clifton NJ 155-194 (1986)); acylation (Tarr, supra); chemical
coupling to an appropriate carrier (Mishell and Shiigi, eds., Selected Methods in Cellular
Immunology, WH Freeman, San Francisco, CA (1980); U.S. Patent 4,939,239); or mild
formalin treatment (Marsh, (1971) Int. Arch. of Allergy and Appl. Immunol. 41:199-215).

To facilitate purification and potentially increase solubility of an M. catarrhalis
protein or peptide, it is possible to add an amino acid fusion moiety to the peptide backbone.
For example, hexa-histidine can be added to the protein for purification by immobilized
metal ion affinity chromatography (Hochuli, E. et al., (1988) Bio/Technology 6:1321-1325).
In addition, to facilitate isolation of peptides free of irrelevant sequences, specific
endoprotease cleavage sites can be introduced between the sequences of the fusion moiety
and the peptide.

To potentially aid proper antigen processing of epitopes within an M. catarrhalis
polypeptide, canonical protease sensitive sites can be engineered between regions, each
comprising at least one epitope, via recombinant or synthetic methods. For example, charged
amino acid pairs, such as KK or RR, can be introduced between regions within a protein or
fragment during recombinant construction thereof. The resulting peptide can be rendered
sensitive to cleavage by cathepsin and/or other trypsin-like enzymes which would generate
portions of the protein containing one or more epitopes. In addition, such charged amino
acid residues can result in an increase in the solubility of the peptide.

PRIMARY METHODS FOR SCREENING POLYPEPTIDES AND ANALOGS 
Various techniques are known in the art for screening generated mutant gene
products. Techniques for screening large gene libraries often include cloning the gene
library into replicable expression vectors, transforming appropriate cells with the resulting
library of vectors, and expressing the genes under conditions in which detection of a desired
activity, e.g., in this case, binding to M. catarrhalis polypeptide or an interacting protein,
facilitates relatively easy isolation of the vector encoding the gene whose product was
detected. Each of the techniques described below is amenable to high through-put analysis
for screening large numbers of sequences created, e.g., by random mutagenesis techniques.

TWO HYBRID SYSTEMS 
Two hybrid assays such as the system described below (as with the other screening
methods described herein) can be used to identify polypeptides, e.g., fragments or analogs of
a naturally-occurring M. catarrhalis polypeptide, e.g., of cellular proteins, or of randomly
generated polypeptides which bind to an M. catarrhalis protein. (The M. catarrhalis
domain is used as the bait protein and the library of variants is expressed as prey fusion
proteins.) In an analogous fashion, a two hybrid assay (as with the other screening methods
described herein) can be used to find polypeptides which bind an M. catarrhalis
polypeptide.

DISPLAY LIBRARIES 

In one approach to screening assays, the Moraxella peptides are displayed on the
surface of a cell or viral particle, and the ability of particular cells or viral particles to bind an
appropriate receptor protein via the displayed product is detected in a "panning assay". For
example, the gene library can be cloned into the gene for a surface membrane protein of a
bacterial cell, and the resulting fusion protein detected by panning (Ladner et al., WO
88/06630; Fuchs et al. (1991) Bio/Technology 9:1370-1371; and Goward et al. (1992) TIBS
18:136-140). In a similar fashion, a detectably labeled ligand can be used to score for
potentially functional peptide homologs. Fluorescently labeled ligands, e.g., receptors, can
be used to detect homologs which retain ligand-binding activity. The use of fluorescently
labeled ligands allows cells to be visually inspected and separated under a fluorescence
microscope, or, where the morphology of the cell permits, to be separated by a
fluorescence-activated cell sorter.

A gene library can be expressed as a fusion protein on the surface of a viral particle.
For instance, in the filamentous phage system, foreign peptide sequences can be expressed
on the surface of infectious phage, thereby conferring two significant benefits. First, since
these phage can be applied to affinity matrices at concentrations well over 10^13 phage per
milliliter, a large number of phage can be screened at one time. Second, since each
infectious phage displays a gene product on its surface, if a particular phage is recovered
from an affinity matrix in low yield, the phage can be amplified by another round of
infection. The group of almost identical E. coli filamentous phages, M13, fd, and f1, are
most often used in phage display libraries. Either of the phage gIII or gVIII coat proteins can
be used to generate fusion proteins without disrupting the ultimate packaging of the viral
particle. Foreign epitopes can be expressed at the NH2-terminal end of pIII and phage
bearing such epitopes recovered from a large excess of phage lacking this epitope (Ladner et
al., PCT publication WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et
al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J. 12:725-734;
Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) PNAS 89:4457-4461).

A common approach uses the maltose receptor of E. coli (the outer membrane
protein, LamB) as a peptide fusion partner (Charbit et al. (1986) EMBO 5, 3029-3037).
Oligonucleotides have been inserted into plasmids encoding the LamB gene to produce
peptides fused into one of the extracellular loops of the protein. These peptides are available
for binding to ligands, e.g., to antibodies, and can elicit an immune response when the cells
are administered to animals. Other cell surface proteins, e.g., OmpA (Schorr et al. (1991)
Vaccines 91, pp. 387-392), PhoE (Agterberg, et al. (1990) Gene 88, 37-45), and PAL (Fuchs
et al. (1991) Bio/Tech 9, 1369-1372), as well as large bacterial surface structures have served
as vehicles for peptide display. Peptides can be fused to pilin, a protein which polymerizes
to form the pilus, a conduit for interbacterial exchange of genetic information (Thiry et al.
(1989) Appl. Environ. Microbiol. 55, 984-993). Because of its role in interacting with other
cells, the pilus provides a useful support for the presentation of peptides to the extracellular
environment. Another large surface structure used for peptide display is the bacterial motive
organ, the flagellum. Fusion of peptides to the subunit protein flagellin offers a dense array
of many peptide copies on the host cells (Kuwajima et al. (1988) Bio/Tech. 6, 1080-1083).
Surface proteins of other bacterial species have also served as peptide fusion partners.
Examples include the Moraxella protein A and the outer membrane IgA protease of
Neisseria (Hansson et al. (1992) J. Bacteriol. 174, 4239-4245 and Klauser et al. (1990)
EMBO J. 9, 1991-1999).

In the filamentous phage systems and the LamB system described above, the physical
link between the peptide and its encoding DNA occurs by the containment of the DNA
within a particle (cell or phage) that carries the peptide on its surface. Capturing the peptide
captures the particle and the DNA within. An alternative scheme uses the DNA-binding
protein LacI to form a link between peptide and DNA (Cull et al. (1992) PNAS USA
89:1865-1869). This system uses a plasmid containing the LacI gene with an
oligonucleotide cloning site at its 3'-end. Under the controlled induction by arabinose, a
LacI-peptide fusion protein is produced. This fusion retains the natural ability of LacI to
bind to a short DNA sequence known as LacO operator (LacO). By installing two copies of
LacO on the expression plasmid, the LacI-peptide fusion binds tightly to the plasmid that
encoded it. Because the plasmids in each cell contain only a single oligonucleotide sequence
and each cell expresses only a single peptide sequence, the peptides become specifically and
stably associated with the DNA sequence that directed their synthesis. The cells of the
library are gently lysed and the peptide-DNA complexes are exposed to a matrix of
immobilized receptor to recover the complexes containing active peptides. The associated
plasmid DNA is then reintroduced into cells for amplification and DNA sequencing to
determine the identity of the peptide ligands. As a demonstration of the practical utility of
the method, a large random library of dodecapeptides was made and selected on a
monoclonal antibody raised against the opioid peptide dynorphin B. A cohort of peptides
was recovered, all related by a consensus sequence corresponding to a six-residue portion of
dynorphin B. (Cull et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:1865-1869)

This scheme, sometimes referred to as peptides-on-plasmids, differs in two important
ways from the phage display methods. First, the peptides are attached to the C-terminus of
the fusion protein, resulting in the display of the library members as peptides having free
carboxy termini. Both of the filamentous phage coat proteins, pIII and pVIII, are anchored to
the phage through their C-termini, and the guest peptides are placed into the
outward-extending N-terminal domains. In some designs, the phage-displayed peptides are
presented right at the amino terminus of the fusion protein (Cwirla et al. (1990) Proc. Natl.
Acad. Sci. U.S.A. 87, 6378-6382). A second difference is the set of biological biases affecting
the population of peptides actually present in the libraries. The LacI fusion molecules are
confined to the cytoplasm of the host cells. The phage coat fusions are exposed briefly to the
cytoplasm during translation but are rapidly secreted through the inner membrane into the
periplasmic compartment, remaining anchored in the membrane by their C-terminal
hydrophobic domains, with the N-termini, containing the peptides, protruding into the
periplasm while awaiting assembly into phage particles. The peptides in the LacI and phage
libraries may differ significantly as a result of their exposure to different proteolytic
activities. The phage coat proteins require transport across the inner membrane and signal
peptidase processing as a prelude to incorporation into phage. Certain peptides exert a
deleterious effect on these processes and are underrepresented in the libraries (Gallop et al.
(1994) J. Med. Chem. 37(9):1233-1251). These particular biases are not a factor in the LacI
display system.

The number of small peptides available in recombinant random libraries is enormous.
Libraries of 10^7-10^9 independent clones are routinely prepared. Libraries as large as 10^11
recombinants have been created, but this size approaches the practical limit for clone
libraries. This limitation in library size occurs at the step of transforming the DNA
containing randomized segments into the host bacterial cells. To circumvent this limitation,
an in vitro system based on the display of nascent peptides in polysome complexes has
recently been developed. This display library method has the potential of producing libraries
3-6 orders of magnitude larger than the currently available phage/phagemid or plasmid
libraries. Furthermore, the construction of the libraries, expression of the peptides, and
screening are done in an entirely cell-free format.
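
To put these library sizes in perspective, the following minimal sketch compares them with
the number of possible peptide sequences. It is an illustration only: the function names are
hypothetical, the peptide length of 12 residues is chosen as an example, and the coverage
estimate assumes every sequence is equally likely to be sampled, which real libraries do not
satisfy.

```python
import math

AMINO_ACIDS = 20  # the 20 standard amino acids


def sequence_space(length: int) -> int:
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** length


def expected_coverage(library_size: float, length: int) -> float:
    """Expected fraction of sequence space sampled at least once.

    Assumes uniform, independent sampling (an idealization), so the
    fraction is 1 - exp(-library_size / sequence_space).
    """
    return -math.expm1(-library_size / sequence_space(length))


if __name__ == "__main__":
    # Library sizes quoted in the text; 12 residues is an illustrative length.
    for size in (1e7, 1e9, 1e11):
        fraction = expected_coverage(size, 12)
        print(f"{size:.0e} clones sample about {fraction:.2e} of 12-mer space")
```

Even the largest clone library samples only a small fraction of the possible 12-residue
sequences under these assumptions, which is the motivation for the larger cell-free libraries.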

In one application of this method (Gallop et al. (1994) J. Med. Chem. 37(9):1233-1251),
a molecular DNA library encoding 10^12 decapeptides was constructed and the library
expressed in an E. coli S30 in vitro coupled transcription/translation system. Conditions
were chosen to stall the ribosomes on the mRNA, causing the accumulation of a substantial
proportion of the RNA in polysomes and yielding complexes containing nascent peptides
still linked to their encoding RNA. The polysomes are sufficiently robust to be affinity
purified on immobilized receptors in much the same way as the more conventional
recombinant peptide display libraries are screened. RNA from the bound complexes is
recovered, converted to cDNA, and amplified by PCR to produce a template for the next
round of synthesis and screening. The polysome display method can be coupled to the phage
display system. Following several rounds of screening, cDNA from the enriched pool of
polysomes was cloned into a phagemid vector. This vector serves as both a peptide
expression vector, displaying peptides fused to the coat proteins, and as a DNA sequencing
vector for peptide identification. By expressing the polysome-derived peptides on phage,
one can either continue the affinity selection procedure in this format or assay the peptides
on individual clones for binding activity in a phage ELISA, or for binding specificity in a
competition phage ELISA (Barret et al. (1992) Anal. Biochem. 204, 357-364). To identify the
sequences of the active peptides, one sequences the DNA produced by the phagemid host.



SECONDARY SCREENING OF POLYPEPTIDES AND ANALOGS 

The high-throughput assays described above can be followed by secondary screens
in order to identify further biological activities which will, e.g., allow one skilled in the art to
differentiate agonists from antagonists. The type of secondary screen used will depend on
the desired activity that needs to be tested. For example, an assay can be developed in which
the ability to inhibit an interaction between a protein of interest and its respective ligand can
be used to identify antagonists from a group of peptide fragments isolated through one of the
primary screens described above.

Therefore, methods for generating fragments and analogs and testing them for
activity are known in the art. Once the core sequence of interest is identified, it is routine for
one skilled in the art to obtain analogs and fragments.

PEPTIDE MIMETICS OF M. CATARRHALIS POLYPEPTIDES

The invention also provides for reduction of the protein binding domains of the
subject M. catarrhalis polypeptides to generate mimetics, e.g. peptide or non-peptide agents.
The peptide mimetics are able to disrupt binding of a polypeptide to its counter ligand, e.g.,
in the case of an M. catarrhalis polypeptide binding to a naturally occurring ligand. The
critical residues of a subject M. catarrhalis polypeptide which are involved in molecular
recognition of a polypeptide can be determined and used to generate M. catarrhalis-derived
peptidomimetics which competitively or noncompetitively inhibit binding of the M.
catarrhalis polypeptide with an interacting polypeptide (see, for example, European patent
applications EP-412,762A and EP-B31,080A).

For example, scanning mutagenesis can be used to map the amino acid residues of a
particular M. catarrhalis polypeptide involved in binding an interacting polypeptide.
Peptidomimetic compounds (e.g. diazepine or isoquinoline derivatives) can then be generated
which mimic those residues in binding to an interacting polypeptide, and which therefore can
inhibit binding of an M. catarrhalis polypeptide to an interacting polypeptide and thereby
interfere with the function of the M. catarrhalis polypeptide. For instance, non-hydrolyzable
peptide analogs of such residues can be generated using benzodiazepine (e.g., see Freidinger
et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden,
Netherlands, 1988), azepine (e.g., see Huffman et al. in Peptides: Chemistry and Biology,
G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands, 1988), substituted gamma lactam
rings (Garvey et al. in Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM
Publisher: Leiden, Netherlands, 1988), keto-methylene pseudopeptides (Ewenson et al.
(1986) J Med Chem 29:295; and Ewenson et al. in Peptides: Structure and Function
(Proceedings of the 9th American Peptide Symposium) Pierce Chemical Co. Rockland, IL,
1985), β-turn dipeptide cores (Nagai et al. (1985) Tetrahedron Lett 26:647; and Sato et al.
(1986) J Chem Soc Perkin Trans 1:1231), and β-aminoalcohols (Gordon et al. (1985)
Biochem Biophys Res Commun 126:419; and et al. (1986) Biochem Biophys Res Commun
134:71).



VACCINE FORMULATIONS FOR M. CATARRHALIS NUCLEIC ACIDS AND
POLYPEPTIDES

This invention also features vaccine compositions for protection against infection by
M. catarrhalis or for treatment of M. catarrhalis infection. In one embodiment, the vaccine
compositions contain one or more immunogenic components such as a surface protein from
M. catarrhalis, or portion thereof, and a pharmaceutically acceptable carrier. Nucleic acids
within the scope of the invention are exemplified by the nucleic acids of the invention
contained in the Sequence Listing which encode M. catarrhalis surface proteins. Any
nucleic acid encoding an immunogenic M. catarrhalis protein, or portion thereof, which is
capable of expression in a cell, can be used in the present invention. These vaccines have
therapeutic and prophylactic utilities.

One aspect of the invention provides a vaccine composition for protection against
infection by M. catarrhalis which contains at least one immunogenic fragment of an M.
catarrhalis protein and a pharmaceutically acceptable carrier. Preferred fragments include
peptides of at least about 10 amino acid residues in length, preferably about 10-20 amino
acid residues in length, and more preferably about 12-16 amino acid residues in length.
Immunogenic components of the invention can be obtained, for example, by
screening polypeptides recombinantly produced from the corresponding fragment of the
nucleic acid encoding the full-length M. catarrhalis protein. In addition, fragments can be
chemically synthesized using techniques known in the art such as conventional Merrifield
solid phase f-Moc or t-Boc chemistry.
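
As an illustration of how candidate fragments in the preferred length range might be
enumerated before synthesis or screening, the sketch below tiles a protein sequence with
overlapping peptides. The function name, the step size, and the example sequence are all
hypothetical; the sequence is a placeholder and is not taken from the Sequence Listing.

```python
def overlapping_fragments(sequence: str, length: int = 14, step: int = 4) -> list[str]:
    """Return peptides of a fixed length that tile the sequence.

    `length` defaults to 14 residues, inside the 12-16 residue range
    described as more preferred; `step` is an assumed offset between
    the start positions of consecutive fragments.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) < length:
        return []
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Placeholder sequence for illustration only.
    example = "MKTLLAVSAALLSACGGNNQSTEAPK"
    for peptide in overlapping_fragments(example):
        print(peptide)
```

A smaller step gives more overlap between neighboring fragments at the cost of more peptides
to produce and test.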

In one embodiment, immunogenic components are identified by the ability of the
peptide to stimulate T cells. Peptides which stimulate T cells, as determined by, for
example, T cell proliferation or cytokine secretion are defined herein as comprising at least
one T cell epitope. T cell epitopes are believed to be involved in initiation and perpetuation
of the immune response to the protein allergen which is responsible for the clinical
symptoms of allergy. These T cell epitopes are thought to trigger early events at the level of
the T helper cell by binding to an appropriate HLA molecule on the surface of an antigen
presenting cell, thereby stimulating the T cell subpopulation with the relevant T cell receptor
for the epitope. These events lead to T cell proliferation, lymphokine secretion, local
inflammatory reactions, recruitment of additional immune cells to the site of antigen/T cell
interaction, and activation of the B cell cascade, leading to the production of antibodies. A T
cell epitope is the basic element, or smallest unit of recognition by a T cell receptor, where
the epitope comprises amino acids essential to receptor recognition (e.g., approximately 6 or
7 amino acid residues). Amino acid sequences which mimic those of the T cell epitopes are
within the scope of this invention.

Screening immunogenic components can be accomplished using one or more of
several different assays. For example, in vitro, peptide T cell stimulatory activity is assayed
by contacting a peptide known or suspected of being immunogenic with an antigen
presenting cell which presents appropriate MHC molecules in a T cell culture. Presentation
of an immunogenic M. catarrhalis peptide in association with appropriate MHC molecules
to T cells in conjunction with the necessary co-stimulation has the effect of transmitting a
signal to the T cell that induces the production of increased levels of cytokines, particularly
of interleukin-2 and interleukin-4. The culture supernatant can be obtained and assayed for
interleukin-2 or other known cytokines. For example, any one of several conventional assays
for interleukin-2 can be employed, such as the assay described in Proc. Natl. Acad. Sci. USA,
86:1333 (1989), the pertinent portions of which are incorporated herein by reference. A kit
for an assay for the production of interferon is also available from Genzyme Corporation
(Cambridge, MA).

Alternatively, a common assay for T cell proliferation entails measuring tritiated
thymidine incorporation. The proliferation of T cells can be measured in vitro by
determining the amount of 3H-labeled thymidine incorporated into the replicating DNA of
cultured cells. Therefore, the rate of DNA synthesis and, in turn, the rate of cell division can
be quantified.
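
Thymidine incorporation data are often summarized as a ratio of counts in stimulated
cultures to counts in unstimulated control cultures. The sketch below shows that
arithmetic. The term "stimulation index", the function name, and the example counts are
assumptions added for illustration and do not come from this specification.

```python
from statistics import mean


def stimulation_index(stimulated_cpm: list[float], control_cpm: list[float]) -> float:
    """Ratio of mean counts per minute with peptide to mean counts without.

    Both arguments are replicate measurements of incorporated
    3H-thymidine; values above 1 indicate more DNA synthesis in the
    stimulated cultures than in the controls.
    """
    control = mean(control_cpm)
    if control <= 0:
        raise ValueError("control counts must be positive")
    return mean(stimulated_cpm) / control


if __name__ == "__main__":
    # Hypothetical replicate counts for one peptide and its medium-only control.
    with_peptide = [12000.0, 11500.0, 12500.0]
    medium_only = [3000.0, 2900.0, 3100.0]
    print(f"stimulation index: {stimulation_index(with_peptide, medium_only):.1f}")
```

What ratio counts as a positive response depends on the assay and would have to be set
from appropriate controls.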

Vaccine compositions of the invention containing immunogenic components (e.g.,
M. catarrhalis polypeptide or fragment thereof or nucleic acid encoding an M. catarrhalis
polypeptide or fragment thereof) preferably include a pharmaceutically acceptable carrier.
The term "pharmaceutically acceptable carrier" refers to a carrier that does not cause an
allergic reaction or other untoward effect in patients to whom it is administered. Suitable
pharmaceutically acceptable carriers include, for example, one or more of water, saline,
phosphate buffered saline, dextrose, glycerol, ethanol and the like, as well as combinations
thereof. Pharmaceutically acceptable carriers may further comprise minor amounts of
auxiliary substances such as wetting or emulsifying agents, preservatives or buffers, which
enhance the shelf life or effectiveness of the antibody. For vaccines of the invention
containing M. catarrhalis polypeptides, the polypeptide is co-administered with a suitable
adjuvant.

It will be apparent to those of skill in the art that the therapeutically effective amount
of DNA or protein of this invention will depend, inter alia, upon the administration
schedule, the unit dose of antibody administered, whether the protein or DNA is
administered in combination with other therapeutic agents, the immune status and health of
the patient, and the therapeutic activity of the particular protein or DNA.

Vaccine compositions are conventionally administered parenterally, e.g., by injection,
either subcutaneously or intramuscularly. Methods for intramuscular immunization are
described by Wolff et al. (1990) Science 247:1465-1468 and by Sedegah et al. (1994)
Immunology 91:9866-9870. Other modes of administration include oral and pulmonary
formulations, suppositories, and transdermal applications. Oral immunization is preferred
over parenteral methods for inducing protection against infection by M. catarrhalis (Cain
et al. (1993) Vaccine 11:637-642). Oral formulations include such normally employed
excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharine, cellulose, magnesium carbonate, and the like.

The vaccine compositions of the invention can include an adjuvant, including, but
not limited to, aluminum hydroxide; N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP);
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP);
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-
glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE); RIBI,
which contains three components from bacteria, namely monophosphoryl lipid A, trehalose
dimycolate, and cell wall skeleton (MPL + TDM + CWS), in a 2% squalene/Tween 80
emulsion; and cholera toxin. Others which may be used are non-toxic derivatives of cholera
toxin, including its B subunit, and/or conjugates or genetically engineered fusions of the M.
catarrhalis polypeptide with cholera toxin or its B subunit, procholeragenoid, fungal
polysaccharides, including schizophyllan, muramyl dipeptide, muramyl dipeptide
derivatives, phorbol esters, labile toxin of E. coli, non-M. catarrhalis bacterial lysates, block
polymers or saponins.

Other suitable delivery methods include biodegradable microcapsules or
immunostimulating complexes (ISCOMs), cochleates, or liposomes, genetically engineered
attenuated live vectors such as viruses or bacteria, and recombinant (chimeric) virus-like
particles, e.g., bluetongue. The amount of adjuvant employed will depend on the type of
adjuvant used. For example, when the mucosal adjuvant is cholera toxin, it is suitably used
in an amount of 5 mg to 50 mg, for example 10 mg to 35 mg. When used in the form of
microcapsules, the amount used will depend on the amount employed in the matrix of the
microcapsule to achieve the desired dosage. The determination of this amount is within the
skill of a person of ordinary skill in the art.

Carrier systems in humans may include enteric release capsules protecting the
antigen from the acidic environment of the stomach, and including M. catarrhalis
polypeptide in an insoluble form as fusion proteins. Suitable carriers for the vaccines of the
invention are enteric coated capsules and polylactide-glycolide microspheres. Suitable
diluents are 0.2 N NaHCO3 and/or saline.

Vaccines of the invention can be administered as a primary prophylactic agent in
adults or in children, as a secondary prevention after successful eradication of M.
catarrhalis in an infected host, or as a therapeutic agent with the aim of inducing an immune
response in a susceptible host to prevent infection by M. catarrhalis. The vaccines of the
invention are administered in amounts readily determined by persons of ordinary skill in the
art. Thus, for adults a suitable dosage will be in the range of 10 mg to 10 g, preferably 10
mg to 100 mg. A suitable dosage for adults will also be in the range of 5 mg to 500 mg.
Similar dosage ranges will be applicable for children. Those skilled in the art will recognize
that the optimal dose may be more or less depending upon the patient's body weight, disease,
the route of administration, and other factors. Those skilled in the art will also recognize
that appropriate dosage levels can be obtained based on results with known oral vaccines
such as, for example, a vaccine based on an E. coli lysate (6 mg dose daily up to total of 540
mg) and with an enterotoxigenic E. coli purified antigen (4 doses of 1 mg) (Schulman et al.,
J. Urol. 150:917-921 (1993); Boedeker et al., American Gastroenterological Assoc. 999:A-222
(1993)). The number of doses will depend upon the disease, the formulation, and
efficacy data from clinical trials. Without intending any limitation as to the course of
treatment, the treatment can be administered over 3 to 8 doses for a primary immunization
schedule over 1 month (Boedeker, American Gastroenterological Assoc. 888:A-222 (1993)).

In a preferred embodiment, a vaccine composition of the invention can be based on a
killed whole E. coli preparation with an immunogenic fragment of an M. catarrhalis protein
of the invention expressed on its surface or it can be based on an E. coli lysate, wherein the
killed E. coli acts as a carrier or an adjuvant.

It will be apparent to those skilled in the art that some of the vaccine compositions of
the invention are useful only for preventing M. catarrhalis infection, some are useful only
for treating M. catarrhalis infection, and some are useful for both preventing and treating M.
catarrhalis infection. In a preferred embodiment, the vaccine composition of the invention
provides protection against M. catarrhalis infection by stimulating humoral and/or
cell-mediated immunity against M. catarrhalis. It should be understood that amelioration of any
of the symptoms of M. catarrhalis infection is a desirable clinical goal, including a lessening
of the dosage of medication used to treat M. catarrhalis-caused disease, or an increase in the
production of antibodies in the serum or mucous of patients.

ANTIBODIES REACTIVE WITH M. CATARRHALIS POLYPEPTIDES

The invention also includes antibodies specifically reactive with the subject M.
catarrhalis polypeptide. Anti-protein/anti-peptide antisera or monoclonal antibodies can be
made by standard protocols (see, for example, Antibodies: A Laboratory Manual ed. by
Harlow and Lane (Cold Spring Harbor Press: 1988)). A mammal such as a mouse, a hamster
or rabbit can be immunized with an immunogenic form of the peptide. Techniques for
conferring immunogenicity on a protein or peptide include conjugation to carriers or other
techniques well known in the art. An immunogenic portion of the subject M. catarrhalis
polypeptide can be administered in the presence of adjuvant. The progress of immunization
can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or
other immunoassays can be used with the immunogen as antigen to assess the levels of
antibodies.

In a preferred embodiment, the subject antibodies are immunospecific for antigenic
determinants of the M. catarrhalis polypeptides of the invention, e.g. antigenic determinants
of a polypeptide of the invention contained in the Sequence Listing, or a closely related
human or non-human mammalian homolog (e.g., 90% homologous, more preferably at least
about 95% homologous). In yet a further preferred embodiment of the invention, the anti-M.
catarrhalis antibodies do not substantially cross react (i.e., react specifically) with a protein
which is, for example, less than 80 percent homologous to a sequence of the invention
contained in the Sequence Listing. By "not substantially cross react", it is meant that the
antibody has a binding affinity for a non-homologous protein which is less than 10 percent,
more preferably less than 5 percent, and even more preferably less than 1 percent, of the
binding affinity for a protein of the invention contained in the Sequence Listing. In a most
preferred embodiment, there is no cross-reactivity between bacterial and mammalian
antigens.
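
The percentage thresholds above can be expressed as a short calculation. The following is
only a sketch: the function names are hypothetical, and it assumes that binding affinity is
reported as an association constant, so that a larger number means tighter binding.

```python
def cross_reactivity_percent(affinity_nonhomolog: float, affinity_target: float) -> float:
    """Affinity for a non-homologous protein as a percentage of the
    affinity for the target protein (both as association constants)."""
    if affinity_target <= 0:
        raise ValueError("target affinity must be positive")
    return 100.0 * affinity_nonhomolog / affinity_target


def cross_reactivity_tier(percent: float) -> str:
    """Map a percentage onto the 10, 5 and 1 percent thresholds in the text."""
    if percent < 1.0:
        return "less than 1 percent (even more preferred)"
    if percent < 5.0:
        return "less than 5 percent (more preferred)"
    if percent < 10.0:
        return "less than 10 percent (not substantially cross-reactive)"
    return "10 percent or more (substantially cross-reactive)"


if __name__ == "__main__":
    # Hypothetical association constants in 1/M.
    percent = cross_reactivity_percent(2e6, 1e8)
    print(f"{percent:.1f}% -> {cross_reactivity_tier(percent)}")
```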

The term antibody as used herein is intended to include fragments thereof which are
also specifically reactive with M. catarrhalis polypeptides. Antibodies can be fragmented
using conventional techniques and the fragments screened for utility in the same manner as
described above for whole antibodies. For example, F(ab')2 fragments can be generated by
treating antibody with pepsin. The resulting F(ab')2 fragment can be treated to reduce
disulfide bridges to produce Fab' fragments. The antibody of the invention is further
intended to include bispecific and chimeric molecules having an anti-M. catarrhalis portion.

Both monoclonal and polyclonal antibodies (Ab) directed against M. catarrhalis
polypeptides or M. catarrhalis polypeptide variants, and antibody fragments such as Fab'
and F(ab')2, can be used to block the action of M. catarrhalis polypeptide and allow the
study of the role of a particular M. catarrhalis polypeptide of the invention in aberrant or
unwanted intracellular signaling, as well as the normal cellular function of the M. catarrhalis
polypeptide, for example by microinjection of anti-M. catarrhalis polypeptide antibodies of
the present invention.

Antibodies which specifically bind M. catarrhalis epitopes can also be used in
immunohistochemical staining of tissue samples in order to evaluate the abundance and
pattern of expression of M. catarrhalis antigens. Anti-M. catarrhalis polypeptide antibodies
can be used diagnostically in immuno-precipitation and immuno-blotting to detect and
evaluate M. catarrhalis levels in tissue or bodily fluid as part of a clinical testing procedure.
Likewise, the ability to monitor M. catarrhalis polypeptide levels in an individual can allow
determination of the efficacy of a given treatment regimen for an individual afflicted with
such a disorder. The level of an M. catarrhalis polypeptide can be measured in cells found
in bodily fluid, such as in urine samples, or can be measured in tissue, such as produced by
gastric biopsy. Diagnostic assays using anti-M. catarrhalis antibodies can include, for
example, immunoassays designed to aid in early diagnosis of M. catarrhalis infections. The
present invention can also be used as a method of detecting antibodies contained in samples
from individuals infected by this bacterium using specific M. catarrhalis antigens.

Another application of anti-M. catarrhalis polypeptide antibodies of the invention is
in the immunological screening of cDNA libraries constructed in expression vectors such as
λgt11, λgt18-23, λZAP, and λORF8. Messenger libraries of this type, having coding
sequences inserted in the correct reading frame and orientation, can produce fusion proteins.
For instance, λgt11 will produce fusion proteins whose amino termini consist of
β-galactosidase amino acid sequences and whose carboxy termini consist of a foreign
polypeptide. Antigenic epitopes of a subject M. catarrhalis polypeptide can then be detected
with antibodies, as, for example, reacting nitrocellulose filters lifted from infected plates
with anti-M. catarrhalis polypeptide antibodies. Phage, scored by this assay, can then be
isolated from the infected plate. Thus, the presence of M. catarrhalis gene homologs can be
detected and cloned from other species, and alternate isoforms (including splicing variants)
can be detected and cloned.

KITS CONTAINING NUCLEIC ACIDS, POLYPEPTIDES OR ANTIBODIES OF THE 
INVENTION 

The nucleic acid, polypeptides and antibodies of the invention can be combined with
other reagents and articles to form kits. Kits for diagnostic purposes typically comprise the
nucleic acid, polypeptides or antibodies in vials or other suitable vessels. Kits typically
comprise other reagents for performing hybridization reactions, polymerase chain reactions
(PCR), or for reconstitution of lyophilized components, such as aqueous media, salts,
buffers, and the like. Kits may also comprise reagents for sample processing such as
detergents, chaotropic salts and the like. Kits may also comprise immobilization means such
as particles, supports, wells, dipsticks and the like. Kits may also comprise labeling means
such as dyes, developing reagents, radioisotopes, fluorescent agents, luminescent or
chemiluminescent agents, enzymes, intercalating agents and the like. With the nucleic acid
and amino acid sequence information provided herein, individuals skilled in the art can readily
assemble kits to serve their particular purpose. Kits further can include instructions for use.

BIO CHIP TECHNOLOGY 

The nucleic acid sequence of the present invention may be used to detect M.
catarrhalis or other species of Moraxella nucleic acid sequence using bio chip technology.
Bio chips containing arrays of nucleic acid sequence can also be used to measure expression
of genes of M. catarrhalis or other species of Moraxella. For example, to diagnose a patient
with an M. catarrhalis or other Moraxella infection, a sample from a human or animal can be
used as a probe on a bio chip containing an array of nucleic acid sequence from the present
invention. In addition, a sample from a disease state can be compared to a sample from a
non-disease state, which would help identify a gene that is up-regulated or expressed in the
disease state. This would provide valuable insight as to the mechanism by which the disease
manifests. Changes in gene expression can also be used to identify critical pathways
involved in drug transport or metabolism, and may enable the identification of novel targets
involved in virulence or host cell interactions involved in maintenance of an infection.
Procedures using such techniques have been described by Brown et al., 1995, Science
270:467-470.
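
The comparison of a disease-state sample with a non-disease-state sample can be sketched
as a fold-change calculation on array signal intensities. The gene names, intensity values,
pseudocount, and threshold below are hypothetical and serve only to illustrate the idea; a
real analysis would also need normalization and replicate measurements.

```python
import math


def log2_fold_changes(disease: dict[str, float], control: dict[str, float],
                      pseudocount: float = 1.0) -> dict[str, float]:
    """log2(disease / control) for every gene measured in both samples.

    The pseudocount (an assumed choice) avoids division by zero for
    genes with no detectable signal.
    """
    shared = disease.keys() & control.keys()
    return {gene: math.log2((disease[gene] + pseudocount) / (control[gene] + pseudocount))
            for gene in shared}


def upregulated(changes: dict[str, float], threshold: float = 1.0) -> list[str]:
    """Genes whose log2 fold change meets the threshold (1.0 means two-fold)."""
    return sorted(gene for gene, value in changes.items() if value >= threshold)


if __name__ == "__main__":
    # Hypothetical signal intensities for three array features.
    disease_state = {"geneA": 800.0, "geneB": 100.0, "geneC": 50.0}
    non_disease_state = {"geneA": 100.0, "geneB": 100.0, "geneC": 200.0}
    changes = log2_fold_changes(disease_state, non_disease_state)
    print(upregulated(changes))
```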

Bio chips can also be used to monitor the genetic changes of potential therapeutic
compounds, including deletions, insertions or mismatches. Once the therapeutic is added to
the patient, changes to the genetic sequence can be evaluated to assess its efficacy. In
addition, the nucleic acid sequence of the present invention can be used to determine
essential genes in cell cycling. As described in Iyer et al., 1999 (Science, 283:83-87), genes
essential in the cell cycle can be identified using bio chips. Furthermore, the present
invention provides nucleic acid sequence which can be used with bio chip technology to
understand regulatory networks in bacteria, measure the response to environmental signals or
drugs as in drug screening, and study virulence induction (Mons et al., 1998, Nature
Biotechnology, 16:45-48). Patents teaching this technology include U.S. Patents 5445934,
5744305, and 5800992.

DRUG SCREENING ASSAYS USING M. CATARRHALIS POLYPEPTIDES

By making available purified and recombinant M. catarrhalis polypeptides, the
present invention provides assays which can be used to screen for drugs which are either
agonists or antagonists of the normal cellular function, in this case, of the subject M.
catarrhalis polypeptides, or of their role in intracellular signaling. Such inhibitors or
potentiators may be useful as new therapeutic agents to combat M. catarrhalis infections in
humans. A variety of assay formats will suffice and, in light of the present inventions, will
be comprehended by the person skilled in the art.

In many drug screening programs which test libraries of compounds and natural
extracts, high throughput assays are desirable in order to maximize the number of
compounds surveyed in a given period of time. Assays which are performed in cell-free
systems, such as may be derived with purified or semi-purified proteins, are often preferred
as "primary" screens in that they can be generated to permit rapid development and relatively
easy detection of an alteration in a molecular target which is mediated by a test compound.
Moreover, the effects of cellular toxicity and/or bioavailability of the test compound can be
generally ignored in the in vitro system, the assay instead being focused primarily on the
effect of the drug on the molecular target as may be manifest in an alteration of binding
affinity with other proteins or change in enzymatic properties of the molecular target.
Accordingly, in an exemplary screening assay of the present invention, the compound of
interest is contacted with an isolated and purified M. catarrhalis polypeptide.

Screening assays can be constructed in vitro with a purified M. catarrhalis
polypeptide or fragment thereof, such as an M. catarrhalis polypeptide having enzymatic
activity, such that the activity of the polypeptide produces a detectable reaction product. The
efficacy of the compound can be assessed by generating dose response curves from data
obtained using various concentrations of the test compound. Moreover, a control assay can
also be performed to provide a baseline for comparison. Suitable products include those
with distinctive absorption, fluorescence, or chemiluminescence properties, for example,
because detection may be easily automated. A variety of synthetic or naturally occurring
compounds can be tested in the assay to identify those which inhibit or potentiate the activity
of the M. catarrhalis polypeptide. Some of these active compounds may directly, or with
chemical alterations to promote membrane permeability or solubility, also inhibit or
potentiate the same activity (e.g., enzymatic activity) in whole, live M. catarrhalis cells.
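
By way of illustration only, the dose response analysis described above may be sketched as follows. The sketch assumes that a signal proportional to the reaction product has been read for each concentration of test compound, together with a no-compound control and a background reading. The function names, the choice of the 50% point (IC50) as the summary value, and the log-linear interpolation are assumptions made for this example and are not prescribed by the specification.

```python
import math


def percent_inhibition(signal, uninhibited, background):
    """Percent inhibition of product formation relative to the no-compound control."""
    window = uninhibited - background
    if window <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (signal - background) / window)


def estimate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving 50% inhibition.

    Interpolates linearly on a log10(concentration) axis between the two
    tested concentrations that bracket 50% inhibition. Returns None when
    the tested range never reaches 50%.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo == 50.0:
            return c_lo
        if (i_lo - 50.0) * (i_hi - 50.0) < 0:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    if points and points[-1][1] == 50.0:
        return points[-1][0]
    return None
```

A potentiator would appear in such an analysis as a negative percent inhibition; a fuller treatment might fit a sigmoidal model to the whole curve rather than interpolating between two points.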



OVEREXPRESSION ASSAYS

Overexpression assays are based on the premise that overproduction of a protein
would lead to a higher level of resistance to compounds that selectively interfere with the
function of that protein. Overexpression assays may be used to identify compounds that
interfere with the function of virtually any type of protein, including without limitation
enzymes, receptors, DNA- or RNA-binding proteins, or any proteins that are directly or
indirectly involved in regulating cell growth.

Typically, two bacterial strains are constructed. One contains a single copy of the
gene of interest, and a second contains several copies of the same gene. Identification of
useful inhibitory compounds using this type of assay is based on a comparison of the activity of
a test compound in inhibiting growth and/or viability of the two strains. The method
involves constructing a nucleic acid vector that directs high level expression of a particular
target nucleic acid. The vectors are then transformed into host cells in single or multiple
copies to produce strains that express low to moderate and high levels of the protein encoded by
the target sequence (strains A and B, respectively). Nucleic acid comprising sequences
encoding the target gene can, of course, be directly integrated into the host cell.

Large numbers of compounds (or crude substances which may contain active
compounds) are screened for their effect on the growth of the two strains. Agents which
interfere with an unrelated target equally inhibit the growth of both strains. Agents which
interfere with the function of the target at high concentration should inhibit the growth of
both strains. It should be possible, however, to titrate out the inhibitory effect of the
compound in the overexpressing strain. That is, if the compound is affecting the particular
target that is being tested, it should be possible to inhibit the growth of strain A at a
concentration of the compound that allows strain B to grow.
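
The comparison of strains A and B described above may, purely as an illustrative sketch, be reduced to a comparison of the lowest concentration of compound that inhibits growth of each strain. The use of a minimal inhibitory concentration (MIC) as the readout, the function name, and the four-fold threshold are assumptions introduced for this example only.

```python
def classify_compound(mic_strain_a, mic_strain_b, min_fold_shift=4.0):
    """Classify a test compound from its MIC against strains A and B.

    mic_strain_a: MIC for the low-expression strain A (None if never inhibited).
    mic_strain_b: MIC for the overexpressing strain B (None if never inhibited).
    min_fold_shift: assumed minimum MIC ratio (B over A) taken as evidence that
        overexpression of the target titrates out the compound.
    """
    if mic_strain_a is None:
        # Strain A grew at every concentration tested.
        return "inactive"
    if mic_strain_b is None:
        # Strain A was inhibited but strain B grew throughout the tested range.
        return "target-specific candidate"
    if mic_strain_b / mic_strain_a >= min_fold_shift:
        return "target-specific candidate"
    # Both strains are inhibited at similar concentrations.
    return "unrelated target"
```

For example, a compound that inhibits strain A at 2 units but strain B only at 32 units would be flagged as a candidate, whereas a compound inhibiting both strains at 8 units would not.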

Alternatively, a bacterial strain is constructed that contains the gene of interest under
the control of an inducible promoter. Identification of useful inhibitory agents using this
type of assay is based on a comparison of the activity of a test compound in inhibiting
growth and/or viability of this strain under both inducing and non-inducing conditions. The
method involves constructing a nucleic acid vector that directs high-level expression of a
particular target nucleic acid. The vector is then transformed into host cells that are grown
under both non-inducing and inducing conditions (conditions A and B, respectively).

Large numbers of compounds (or crude substances which may contain active
compounds) are screened for their effect on growth under these two conditions. Agents that
interfere with the function of the target should inhibit growth under both conditions. It
should be possible, however, to titrate out the inhibitory effect of the compound under the
inducing (overexpressing) condition. That is, if the compound is affecting the particular
target that is being tested, it should be possible to inhibit growth under condition A at a
concentration that allows the strain to grow under condition B.

LIGAND-BINDING ASSAYS 

Many of the targets according to the invention have functions that have not yet been
identified. Ligand-binding assays are useful to identify inhibitor compounds that interfere
with the function of a particular target, even when that function is unknown. These assays
are designed to detect binding of test compounds to particular targets. The detection may
involve direct measurement of binding. Alternatively, indirect indications of binding may
involve stabilization of protein structure or disruption of a biological function. Non-limiting
examples of useful ligand-binding assays are detailed below.

A useful method for the detection and isolation of binding proteins is the
Biomolecular Interaction Assay (BIAcore) system developed by Pharmacia Biosensor and
described in the manufacturer's protocol (LKB Pharmacia, Sweden). The BIAcore system
uses an affinity purified anti-GST antibody to immobilize GST-fusion proteins onto a sensor
chip. The sensor utilizes surface plasmon resonance, which is an optical phenomenon that
detects changes in refractive indices. In accordance with the practice of the invention, a
protein of interest is coated onto a chip and test compounds are passed over the chip.
Binding is detected by a change in the refractive index (surface plasmon resonance).

A different type of ligand-binding assay involves scintillation proximity assays (SPA,
described in U.S. Patent No. 4,568,649).

Another type of ligand-binding assay, also undergoing development, is based on the
fact that proteins containing mitochondrial targeting signals are imported into isolated
mitochondria in vitro (Hurt et al., 1985, EMBO J. 4:2061-2068; Eilers and Schatz, Nature,
1986, 322:228-231). In a mitochondrial import assay, expression vectors are constructed in
which nucleic acids encoding particular target proteins are inserted downstream of sequences
encoding mitochondrial import signals. The chimeric proteins are synthesized and tested for
their ability to be imported into isolated mitochondria in the absence and presence of test
compounds. A test compound that binds to the target protein should inhibit its uptake into
isolated mitochondria in vitro.

Another ligand-binding assay is the yeast two-hybrid system (Fields and Song, 1989,
Nature 340:245-246). The yeast two-hybrid system takes advantage of the properties of the
GAL4 protein of the yeast Saccharomyces cerevisiae. The GAL4 protein is a transcriptional
activator required for the expression of genes encoding enzymes of galactose utilization.
This protein consists of two separable and functionally essential domains: an N-terminal
domain which binds to specific DNA sequences (UASg); and a C-terminal domain
containing acidic regions, which is necessary to activate transcription. The native GAL4
protein, containing both domains, is a potent activator of transcription when yeast are grown
on galactose media. The N-terminal domain binds to DNA in a sequence-specific manner
but is unable to activate transcription. The C-terminal domain contains the activating
regions but cannot activate transcription because it fails to be localized to UASg. The
two-hybrid system uses two hybrid proteins containing parts of GAL4: (1) a GAL4
DNA-binding domain fused to a protein 'X' and (2) a GAL4 activation region fused to a
protein 'Y'. If X and Y can form a protein-protein complex and reconstitute proximity of the
GAL4 domains, transcription of a gene regulated by UASg occurs. Creation of two hybrid
proteins, each containing one of the interacting proteins X and Y, allows the activation
region of GAL4 to be brought to its normal site of action.

The binding assay described in Fodor et al., 1991, Science 251:767-773, which
involves testing the binding affinity of test compounds for a plurality of defined polymers
synthesized on a solid substrate, may also be useful.

Compounds which bind to the polypeptides of the invention are potentially useful as
antibacterial agents for use in therapeutic compositions.

Pharmaceutical formulations suitable for antibacterial therapy comprise the
antibacterial agent in conjunction with one or more biologically acceptable carriers. Suitable
biologically acceptable carriers include, but are not limited to, phosphate-buffered saline,
saline, deionized water, or the like. Preferred biologically acceptable carriers are
physiologically or pharmaceutically acceptable carriers.

The antibacterial compositions include an antibacterial effective amount of active
agent. Antibacterial effective amounts are those quantities of the antibacterial agents of the
present invention that afford prophylactic protection against bacterial infections or which
result in amelioration or cure of an existing bacterial infection. This antibacterial effective
amount will depend upon the agent, the location and nature of the infection, and the
particular host. The amount can be determined by experimentation known in the art, such as
by establishing a matrix of dosages and frequencies and comparing a group of experimental
units or subjects to each point in the matrix.

The antibacterial active agents or compositions can be formed into dosage unit forms,
such as, for example, creams, ointments, lotions, powders, liquids, tablets, capsules,
suppositories, sprays, aerosols or the like. If the antibacterial composition is formulated into
a dosage unit form, the dosage unit form may contain an antibacterial effective amount of
active agent. Alternatively, the dosage unit form may include less than such an amount if
multiple dosage unit forms or multiple dosages are to be used to administer a total dosage of
the active agent. Dosage unit forms can include, in addition, one or more excipient(s),
diluent(s), disintegrant(s), lubricant(s), plasticizer(s), colorant(s), dosage vehicle(s),
absorption enhancer(s), stabilizer(s), bactericide(s), or the like.

For general information concerning formulations, see, e.g., Gilman et al. (eds.), 1990,
Goodman and Gilman's: The Pharmacological Basis of Therapeutics, 8th ed., Pergamon
Press; and Remington's Pharmaceutical Sciences, 17th ed., 1990, Mack Publishing Co.,
Easton, PA; Avis et al. (eds.), 1993, Pharmaceutical Dosage Forms: Parenteral
Medications, Dekker, New York; Lieberman et al. (eds.), 1990, Pharmaceutical Dosage
Forms: Disperse Systems, Dekker, New York.

The antibacterial agents and compositions of the present invention are useful for
preventing or treating M. catarrhalis infections. Infection prevention methods incorporate a
prophylactically effective amount of an antibacterial agent or composition. A
prophylactically effective amount is an amount effective to prevent M. catarrhalis infection
and will depend upon the specific bacterial strain, the agent, and the host. These amounts
can be determined experimentally by methods known in the art and as described above.

M. catarrhalis infection treatment methods incorporate a therapeutically effective
amount of an antibacterial agent or composition. A therapeutically effective amount is an
amount sufficient to ameliorate or eliminate the infection. The prophylactically and/or
therapeutically effective amounts can be administered in one administration or over repeated
administrations. Therapeutic administration can be followed by prophylactic administration,
once the initial bacterial infection has been resolved.

The antibacterial agents and compositions can be administered topically or
systemically. Topical application is typically achieved by administration of creams,
ointments, lotions, or sprays as described above. Systemic administration includes both oral
and parenteral routes. Parenteral routes include, without limitation, subcutaneous,
intramuscular, intraperitoneal, intravenous, transdermal, inhalation and intranasal
administration.

EXEMPLIFICATION 

CLONING AND SEQUENCING M. CATARRHALIS GENOMIC SEQUENCE

This invention provides nucleotide sequences of the genome of M. catarrhalis which
thus comprises a DNA sequence library of M. catarrhalis genomic DNA. The invention also
provides nucleotide sequences of two naturally occurring plasmids in M. catarrhalis. The
detailed description that follows provides nucleotide sequences of M. catarrhalis, and also
describes how the sequences were obtained and how ORFs (Open Reading Frames) and
protein-coding sequences can be identified. Also described are methods of using the
disclosed M. catarrhalis sequences in methods including diagnostic and therapeutic
applications. Furthermore, the library can be used as a database for identification and
comparison of medically important sequences in this and other strains of M. catarrhalis as
well as other species of Moraxella.

Chromosomal DNA from strain 98-4362 of M. catarrhalis was isolated using a
protocol described by Storrs et al. (J. Bacteriol. 173:4347-4352 (1991)). The only exception
to this protocol was that lysostaphin (120 U/ml) was used instead of lysozyme. The genomic
DNA prep involved a lysozyme:lysostaphin digestion, sodium dodecyl sulfate lysis,
Proteinase K and RNase treatment, phenol:chloroform extraction, and sodium acetate
precipitation, followed by a CsCl gradient to remove the plasmid.

In the construction of both libraries, genomic M. catarrhalis DNA was
hydrodynamically sheared in an HPLC and then separated on a standard 1% agarose gel. A
fraction corresponding to 2000-3000 bp in length was excised from the gel and purified by
the GeneClean procedure (Bio 101, Inc.).

The purified DNA fragments were then blunt-ended using T4 DNA polymerase. The
healed DNA was then ligated to unique BstXI-linker adapters (5'-GTCTTCACCACGGGG-3'
and 5'-GTGGTGAAGAC-3' in 100-1000 fold molar excess). These linkers are
complementary to the BstXI-cut pGTC vector, while the overhang is not self-complementary.
Therefore, the linkers will not concatemerize nor will the cut vector religate itself easily.
The linker-adapted inserts were separated from the unincorporated linkers on a 1% agarose
gel and purified using GeneClean. The linker-adapted inserts were then ligated to BstXI-cut
vector to construct "shotgun" subclone libraries.

Only major modifications to the protocols are highlighted. Briefly, the library was
then transformed into DH5α competent cells (Gibco/BRL, DH5α transformation protocol).
It was assessed by plating onto antibiotic plates containing ampicillin and IPTG/Xgal. The
plates were incubated overnight at 37°C. Transformants were then used for plating of
clones and picking for sequencing. The cultures were grown overnight at 37°C. DNA was
purified using a silica bead DNA preparation (Engelstein, 1996) method. In this manner, 25
µg of DNA was obtained per clone.

These purified DNA samples were then sequenced using primarily ABI dye-terminator
chemistry. All subsequent steps were based on sequencing by ABI377 automated
DNA sequencing methods. The ABI dye terminator sequence reads were run on ABI377
machines and the data were transferred to UNIX machines following lane tracking of the
gels. Base calls and quality scores were determined using the program PHRED (Ewing et
al., 1998, Genome Res. 8: 175-185; Ewing and Green, 1998, Genome Res. 8: 685-734).
Reads were assembled using PHRAP (P. Green, Abstracts of DOE Human Genome Program
Contractor-Grantee Workshop V, Jan. 1996, p. 157) with default program parameters and
quality scores.
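
For orientation, the quality scores produced by PHRED are related to the estimated probability P of an erroneous base call by Q = -10 log10(P). The following sketch shows that conversion; the function names are illustrative only and are not part of PHRED or PHRAP.

```python
import math


def phred_to_error_probability(q):
    """Estimated probability that a base call with quality score q is wrong."""
    return 10 ** (-q / 10.0)


def error_probability_to_phred(p):
    """Quality score corresponding to an error probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must lie in (0, 1]")
    return -10.0 * math.log10(p)


def expected_errors(qualities):
    """Expected number of miscalled bases in a read with the given quality scores."""
    return sum(phred_to_error_probability(q) for q in qualities)
```

Thus a score of 20 corresponds to an estimated error rate of 1 in 100, and a score of 30 to 1 in 1000.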

Finishing followed the initial assembly. Missing mates (sequences from clones that
only gave reads from one end of the Moraxella DNA inserted in the plasmid) were
identified and sequenced with ABI technology to allow the identification of additional
overlapping contigs.

End-sequencing of randomly picked genomic lambda clones was also performed.
Sequencing of both ends was done for all lambda clones. The lambda library backbone
helped to verify the integrity of the assembly and allowed closure of some of the physical
gaps. Primers for walking off the ends of contigs would be selected using pick_primer (a
GTC program) near the ends of the clones to facilitate gap closure. These walks can be
sequenced using the selected clones and primers. These data are then reassembled with
PHRAP. Additional sequencing using PCR-generated templates and screened and/or
unscreened lambda templates can also be done.

Additional templates for the physical gaps were obtained through PCR using primers 
designed from the ends of the contigs. These templates were then used in sequencing 
reactions to close the gaps. 

Contigs were ordered by aligning identified M. catarrhalis genes to the published
physical maps. Order was confirmed by PCR. The final chromosomal assembly included
119 contigs.






Applicant's Docket No.: PATH03-14 



To identify M. catarrhalis polypeptides, the complete genomic sequence of M.
catarrhalis was analyzed essentially as follows: First, all possible stop-to-stop open reading
frames (ORFs) greater than 180 nucleotides in all six reading frames were translated into
amino acid sequences. Second, the identified ORFs were analyzed for homology to known
(archaebacterial, prokaryotic and eukaryotic) protein sequences. Third, the coding potential of
non-homologous sequences was evaluated with the program GENEMARK(TM)
(Borodovsky and McIninch, 1993, Comp. Chem. 17:123).
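
The first step of the analysis above may be illustrated by the following minimal sketch, which scans both strands of a sequence in all three frames for stop-to-stop ORFs longer than 180 nucleotides and translates them with the standard genetic code. It is not the program actually used; the function names are illustrative, and treating the ends of the sequence as ORF boundaries (so that ORFs running off a contig end are retained) is an assumption of this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(orf):
    """Translate a nucleotide sequence codon by codon ('X' for unknown codons)."""
    return "".join(CODON_TABLE.get(orf[i:i + 3], "X") for i in range(0, len(orf) - 2, 3))


def stop_to_stop_orfs(seq, min_length=180):
    """Yield (strand, frame, start, end, orf) for ORFs longer than min_length nt.

    start and end are 0-based offsets on the strand being scanned; the ORF
    excludes the flanking stop codons. Sequence ends count as boundaries.
    """
    seq = seq.upper()
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = frame
            last = frame
            for pos in range(frame, len(strand_seq) - 2, 3):
                last = pos + 3
                if strand_seq[pos:pos + 3] in STOP_CODONS:
                    if pos - orf_start > min_length:
                        yield strand, frame, orf_start, pos, strand_seq[orf_start:pos]
                    orf_start = pos + 3
            # Stretch after the last stop codon, running to the end of the sequence.
            if last - orf_start > min_length:
                yield strand, frame, orf_start, last, strand_seq[orf_start:last]
```

Each translated ORF could then be passed to a homology search and, where no homolog is found, to a coding-potential program, as set out in the second and third steps.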

IDENTIFICATION, CLONING AND EXPRESSION OF M. CATARRHALIS NUCLEIC
ACIDS

Expression and purification of the M. catarrhalis polypeptides of the invention can 
be performed essentially as outlined below. 

To facilitate the cloning, expression and purification of membrane and secreted
proteins from M. catarrhalis, a gene expression system, such as the pET System (Novagen),
for cloning and expression of recombinant proteins in E. coli, is selected. Also, a DNA
sequence encoding a peptide tag, the His-Tag, is fused to the 3' end of DNA sequences of
interest in order to facilitate purification of the recombinant protein products. The 3' end is
selected for fusion in order to avoid alteration of any 5' terminal signal sequence.

PCR AMPLIFICATION AND CLONING OF NUCLEIC ACIDS CONTAINING ORFS
ENCODING ENZYMES

Nucleic acids chosen (for example, from the nucleic acids set forth in SEQ ID NO: 1
- SEQ ID NO: 2501) for cloning from the 98-4362 strain of M. catarrhalis and plasmids are
prepared for amplification cloning by polymerase chain reaction (PCR). Synthetic
oligonucleotide primers specific for the 5' and 3' ends of open reading frames (ORFs) are
designed and purchased from GibcoBRL Life Technologies (Gaithersburg, MD, USA). All
forward primers (specific for the 5' end of the sequence) are designed to include an NcoI
cloning site at the extreme 5' terminus. These primers are designed to permit initiation of
protein translation at a methionine residue followed by a valine residue and the coding
sequence for the remainder of the native M. catarrhalis DNA sequence. All reverse primers
(specific for the 3' end of any M. catarrhalis ORF) include an EcoRI site at the extreme 5'
terminus to permit cloning of each M. catarrhalis sequence into the reading frame of the
pET-28b. The pET-28b vector provides sequence encoding an additional 20 carboxy-terminal
amino acids including six histidine residues (at the extreme C-terminus), which
comprise the His-Tag.
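
The primer design rules above may be sketched as follows. The NcoI (CCATGG) and EcoRI (GAATTC) recognition sequences are standard; everything else in the sketch is an assumption made for illustration, including the 20-nucleotide annealing length, the choice of GTG as the valine codon, the omission of the native stop codon from the reverse primer, and the function names. The sketch does not check that the insert is in frame with the vector-encoded His-Tag; any spacer bases needed for that purpose would have to be added.

```python
NCOI_SITE = "CCATGG"   # NcoI recognition site; contains the ATG start codon
ECORI_SITE = "GAATTC"  # EcoRI recognition site
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(orf, anneal_len=20, valine_codon="GTG"):
    """Return (forward, reverse) primers, each written 5' to 3'.

    orf: coding strand of the ORF, from its first codon through its stop codon.
    anneal_len: assumed number of native nucleotides included in each primer.
    valine_codon: assumed valine codon; it must start with G because the
        final G of the NcoI site is its first base.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or len(orf) < anneal_len + 6:
        raise ValueError("ORF must be a whole number of codons and long enough")
    if not (valine_codon.startswith("GT") and len(valine_codon) == 3):
        raise ValueError("valine codons have the form GTN")
    # CC + ATG (Met) + G.. (Val), then the native sequence from codon 2 onward.
    forward = NCOI_SITE + valine_codon[1:] + orf[3:3 + anneal_len]
    # Anneal just upstream of the stop codon so translation can continue
    # into the vector-encoded carboxy-terminal His-Tag.
    coding_3prime = orf[-3 - anneal_len:-3]
    reverse = ECORI_SITE + reverse_complement(coding_3prime)
    return forward, reverse
```

In practice, primer melting temperature and secondary structure would also be considered when choosing the annealing length.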

Genomic DNA or plasmid DNA prepared from the 98-4362 strain of M. catarrhalis
is used as the source of template DNA for PCR amplification reactions (Current Protocols in
Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994). To amplify a
DNA sequence containing an M. catarrhalis ORF, genomic DNA (50 nanograms) is
introduced into a reaction vial containing 2 mM MgCl2, 1 micromolar synthetic
oligonucleotide primers (forward and reverse primers) complementary to and flanking a
defined M. catarrhalis ORF, 0.2 mM of each deoxynucleotide triphosphate (dATP, dGTP,
dCTP, dTTP) and 2.5 units of heat stable DNA polymerase (Amplitaq, Roche Molecular
Systems, Inc., Branchburg, NJ, USA) in a final volume of 100 microliters.
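
As a worked illustration of the reaction composition above, the volume of each stock needed for a 100 microliter reaction follows from C1 x V1 = C2 x V2. The final concentrations in the sketch are those stated in the text, with 1 micromolar read as applying to each primer; the stock concentrations are hypothetical defaults chosen for the example and should be replaced by those of the reagents actually used.

```python
def pcr_mix(final_volume_ul=100.0,
            mgcl2_stock_mm=25.0,            # assumed stock concentration
            primer_stock_um=20.0,           # assumed, for each primer
            dntp_stock_mm=10.0,             # assumed, each dNTP in the mix
            polymerase_stock_u_per_ul=5.0,  # assumed
            template_stock_ng_per_ul=50.0): # assumed
    """Return the volume (microliters) of each component for one reaction."""
    def dilute(final_conc, stock_conc):
        # C1 * V1 = C2 * V2, solved for the stock volume V1.
        return final_conc * final_volume_ul / stock_conc

    mix = {
        "MgCl2": dilute(2.0, mgcl2_stock_mm),            # 2 mM final
        "forward primer": dilute(1.0, primer_stock_um),  # 1 micromolar final
        "reverse primer": dilute(1.0, primer_stock_um),  # 1 micromolar final
        "dNTP mix": dilute(0.2, dntp_stock_mm),          # 0.2 mM each final
        "polymerase": 2.5 / polymerase_stock_u_per_ul,   # 2.5 units
        "template DNA": 50.0 / template_stock_ng_per_ul, # 50 nanograms
    }
    remainder = final_volume_ul - sum(mix.values())
    if remainder < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    # Balance of the volume: reaction buffer and water.
    mix["buffer and water"] = remainder
    return mix
```

With the assumed stocks, the listed components occupy 21.5 microliters and the balance of 78.5 microliters is made up with reaction buffer and water.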

Upon completion of thermal cycling reactions, each sample of amplified DNA is
washed and purified using the Qiaquick Spin PCR purification kit (Qiagen, Gaithersburg,
MD, USA). All amplified DNA samples are subjected to digestion with the restriction
endonucleases, e.g., NcoI and EcoRI (New England BioLabs, Beverly, MA, USA) (Current
Protocols in Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994).
DNA samples are then subjected to electrophoresis on 1.0% NuSieve (FMC BioProducts,
Rockland, ME, USA) agarose gels. DNA is visualized by exposure to ethidium bromide and
long wave UV irradiation. DNA contained in slices isolated from the agarose gel is purified
using the Bio 101 GeneClean Kit protocol (Bio 101, Vista, CA, USA).

CLONING OF M. CATARRHALIS NUCLEIC ACIDS INTO AN EXPRESSION VECTOR

The pET-28b vector is prepared for cloning by digestion with restriction
endonucleases, e.g., NcoI and EcoRI (Current Protocols in Molecular Biology, John Wiley
and Sons, Inc., F. Ausubel et al., eds., 1994). The pET-28a vector, which encodes a His-Tag
that can be fused to the 5' end of an inserted gene, is prepared by digestion with appropriate
restriction endonucleases.

Following digestion, DNA inserts are cloned (Current Protocols in Molecular
Biology, John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994) into the previously
digested pET-28b expression vector. Products of the ligation reaction are then used to
transform the BL21 strain of E. coli (Current Protocols in Molecular Biology, John Wiley
and Sons, Inc., F. Ausubel et al., eds., 1994) as described below.

TRANSFORMATION OF COMPETENT BACTERIA WITH RECOMBINANT
PLASMIDS

Competent bacteria, E. coli strain BL21 or E. coli strain BL21(DE3), are transformed
with recombinant pET expression plasmids carrying the cloned M. catarrhalis sequences
according to standard methods (Current Protocols in Molecular Biology, John Wiley and Sons,
Inc., F. Ausubel et al., eds., 1994). Briefly, 1 microliter of ligation reaction is mixed with 50
microliters of electrocompetent cells and subjected to a high voltage pulse, after which
samples are incubated in 0.45 milliliters SOC medium (0.5% yeast extract, 2.0% tryptone,
10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4 and 20 mM glucose) at 37°C
with shaking for 1 hour. Samples are then spread on LB agar plates containing 25
microgram/ml kanamycin sulfate for growth overnight. Transformed colonies of BL21 are
then picked and analyzed to evaluate cloned inserts as described below.

IDENTIFICATION OF RECOMBINANT EXPRESSION VECTORS WITH M.
CATARRHALIS NUCLEIC ACIDS

Individual BL21 clones transformed with recombinant pET-28b M. catarrhalis ORFs
are analyzed by PCR amplification of the cloned inserts using the same forward and reverse
primers, specific for each M. catarrhalis sequence, that were used in the original PCR
amplification cloning reactions. Successful amplification verifies the integration of the M.
catarrhalis sequences in the expression vector (Current Protocols in Molecular Biology,
John Wiley and Sons, Inc., F. Ausubel et al., eds., 1994).

ISOLATION AND PREPARATION OF NUCLEIC ACIDS FROM TRANSFORMANTS

Individual clones of recombinant pET-28b vectors carrying properly cloned M.
catarrhalis ORFs are picked and incubated in 5 ml of LB broth plus 25 microgram/ml
kanamycin sulfate overnight. The following day plasmid DNA is isolated and purified using
the Qiagen plasmid purification protocol (Qiagen Inc., Chatsworth, CA, USA).

EXPRESSION OF RECOMBINANT M. CATARRHALIS SEQUENCES IN E. COLI

The pET vector can be propagated in any E. coli K-12 strain, e.g., HMS174, HB101,
JM109, DH5, etc., for the purpose of cloning or plasmid preparation. Hosts for expression
include E. coli strains containing a chromosomal copy of the gene for T7 RNA polymerase.
These hosts are lysogens of bacteriophage DE3, a lambda derivative that carries the lacI
gene, the lacUV5 promoter and the gene for T7 RNA polymerase. T7 RNA polymerase is
induced by addition of isopropyl-β-D-thiogalactoside (IPTG), and the T7 RNA polymerase
transcribes any target plasmid, such as pET-28b, carrying its gene of interest. Strains used
include: BL21(DE3) (Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W.
(1990) Meth. Enzymol. 185, 60-89).

To express recombinant M. catarrhalis sequences, 50 nanograms of plasmid DNA
isolated as described above is used to transform competent BL21(DE3) bacteria as described
above (provided by Novagen as part of the pET expression system kit). The lacZ gene
(beta-galactosidase) is expressed in the pET-System as described for the M. catarrhalis
recombinant constructions. Transformed cells are cultured in SOC medium for 1 hour, and
the culture is then plated on LB plates containing 25 micrograms/ml kanamycin sulfate. The
following day, bacterial colonies are pooled and grown in LB medium containing kanamycin
sulfate (25 micrograms/ml) to an optical density at 600 nm of 0.5 to 1.0 O.D. units, at which
point 1 millimolar IPTG is added to the culture for 3 hours to induce gene expression of
the M. catarrhalis recombinant DNA constructions.

After induction of gene expression with IPTG, bacteria are pelleted by centrifugation
in a Sorvall RC-3B centrifuge at 3500 x g for 15 minutes at 4°C. Pellets are resuspended in
50 milliliters of cold 10 mM Tris-HCl, pH 8.0, 0.1 M NaCl and 0.1 mM EDTA (STE
buffer). Cells are then centrifuged at 2000 x g for 20 min at 4°C. Wet pellets are weighed
and frozen at -80°C until ready for protein purification.

A variety of methodologies known in the art can be utilized to purify the isolated
proteins (Current Protocols in Protein Science, John Wiley and Sons, Inc., J. E. Coligan et
al., eds., 1995). For example, the frozen cells may be thawed, resuspended in buffer and
ruptured by several passages through a small volume microfluidizer (Model M-110S,
Microfluidics International Corporation, Newton, MA). The resultant homogenate may be
centrifuged to yield a clear supernatant (crude extract) and, following filtration, the crude
extract may be fractionated over columns. Fractions may be monitored by absorbance at
OD280 nm and peak fractions may be analyzed by SDS-PAGE.

The concentrations of purified protein preparations may be quantified
spectrophotometrically using absorbance coefficients calculated from amino acid content
(Perkins, S.J. 1986 Eur. J. Biochem. 157, 169-180). Protein concentrations are also
measured by the method of Bradford, M.M. (1976) Anal. Biochem. 72, 248-254, and Lowry,
O.H., Rosebrough, N., Farr, A.L. & Randall, R.J. (1951) J. Biol. Chem. 193, pages 265-275,
using bovine serum albumin as a standard.
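
The spectrophotometric approach may be sketched as follows, using the Beer-Lambert relation A = epsilon x c x l. The per-residue coefficients in the sketch (5500 for tryptophan, 1490 for tyrosine and 125 for each disulfide-bonded cystine, in M^-1 cm^-1 at 280 nm) are a widely used approximation and are an assumption of this example; they are not necessarily the values of the Perkins reference cited above. The function names are likewise illustrative.

```python
def molar_extinction_280(sequence, cystines=0):
    """Approximate molar absorbance coefficient at 280 nm (M^-1 cm^-1).

    sequence: one-letter amino acid sequence of the protein.
    cystines: number of disulfide bonds (assumed 0 unless known).
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


def concentration_molar(a280, sequence, path_cm=1.0, cystines=0):
    """Protein concentration (mol/l) from absorbance at 280 nm (Beer-Lambert)."""
    epsilon = molar_extinction_280(sequence, cystines)
    if epsilon == 0:
        raise ValueError("no Trp, Tyr or cystine: A280 cannot be used")
    return a280 / (epsilon * path_cm)
```

Multiplying the molar concentration by the molecular weight of the polypeptide gives the concentration in grams per liter, which can be compared with the Bradford or Lowry result.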

SDS-polyacrylamide gels of various concentrations may be purchased from BioRad
(Hercules, CA, USA), and stained with Coomassie blue. Molecular weight markers may
include rabbit skeletal muscle myosin (200 kDa), E. coli β-galactosidase (116 kDa), rabbit
muscle phosphorylase B (97.4 kDa), bovine serum albumin (66.2 kDa), ovalbumin (45 kDa),
bovine carbonic anhydrase (31 kDa), soybean trypsin inhibitor (21.5 kDa), egg white
lysozyme (14.4 kDa) and bovine aprotinin (6.5 kDa).



EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents to the specific embodiments and methods
described herein. The specific embodiments described herein are offered by way of example
only, and the invention is to be limited only by the terms of the appended claims, along with the
full scope of equivalents to which such claims are entitled.
20


. 1940 


207 


624 541 


4.1e-52


Protein name 








Locus Name 


Acc# 



sp : YCEG_HAEIN 



P44720 



Description 
HYPOTHETICAL PROTEIN HI0457






ORF Name 



NTID AAID 



3065S5S7 11 2 



NT .. . AA 
Length Length 
71— 



Score 



Probability 
4 . 8e-08 ' 



Protein name 



Description 



Locus Name • 



sp:KTHY_BACSU



Acc# 
P37537 



THYMIDYLATE KINASE (DTMP KINASE)


ORF Name NTID AAID 


NT 
Length 


AA 
Length 


Score . 


Probability 


p767080_tl_2. |22 1942 | 


1373 


1122 




1522 | 


4 . 6e 


-lb6 


Protein, name 






Locus Name 




Acc# 








sp:EFTU_SHEPU




P33169 


Description 1 . 
















ELONGATION FACTOR TU (EF-TU) 


■ ORF Name N NTID AAID 


NT 
Length 


AA ., 
Length 


Score 


Probability 


32il0007_c2_B t • 22 1943 




. 267 




114 


7.3e- 


-07 


Protein name 






Locus Name 




Acc# 


hypothetical protein PH1485. 




pir :H71023 




H71023 


Description , ... 
















' .ORF Name : ' ' NTID • AAID 


NT , . 
Length 


AA 
Length 


Score 


Probability 


36329582 cl 5 24 ; • 1944 


60 


I 185 


: 


144 


5.5e- 


-09 


Protein name • 


i. 




Locus Name 




Acc# 








sp:YHA2_EIKCO




P35649 


Description 
















HYPOTHETICAL 66.3 KD PROTEIN IN HAG2 5' REGION












ORF Name " NT I'D AAID, 


NT 

Length 


AA 
Length 


. Score 


Probability 


|97i016_ti_i - • | 25 1945 , 


|198 


597 




643 | 


6 ,4e- 


-6.3 : " " 



Protein name 

Description 
ELONGATION FACTOR G (EF-G)



Locus Name 



sp:EFG_HELPY



Acc# 
P56002 






ORF Name 



NT ID 



AAID 



\ NT 
Length ' 



222b312 12 19 



1946 I WTT 



AA 
Length ' 
"11284 



Score 



Probability 
1.9e-64 - — " 



Protein name 



Locus Name 



glycerophosphoryl diester phosphodiesterase



pir:D75630



Acc# 
D75630 



Description 
ORF Name 



NTID , AAID 



23457692' 11 1 



~7T 



NT 
Length 
TWT 



AA 
Length 
[TT7T 



Score Probability , 
pg'o | |1.9e-42 



Protein name 

Description 
RECF PROTEIN 



Locus Name 



sp:RECF_PSEPU



Acc.# 
P13456 



ORF Name 



NTID 



AAID 



26042927 13 19 ■ 



NT 

Length 
84 



— . , Score Probability 



AA 
Length 

[2^ 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT


ORF. Name 1 


NTID 


AAID 


NT 
Length 


AA " 5 

~ \ , . Score 
Length • • 


Probability 


26750837_li_4 


: 29 


. . 1949 


111 . .; 


.336 ' |202 ; | 


|4.9e-16 


Protein name 








Locus .Name 


Acc# 


hypothetical 


protein 






pir:S76551 


S76551 


Description 












ORF Name 


NTID 


AAID 


NT- 
Length 


, AA' ; ■ 
— • » Score 
Length 


Probability 


3<?144675_11_2 


30 


19.50 


525 ~ 


1578 • |185i | 


|6.3e-191 , | 



Protein name \ ■ 

Description > ^ 

AMIDOTRANSFERASE) (GMP SYNTHETASE) 



Locus Name 



sp:GUAA_HAEIN



Acc# 
P44335






ORF. Name 



NTID AAID 



4298443 r2 8 



TT" 



T£5T~ 



NT AA 
Length Length 
822 | ' 12469" 



Score 



12597 



Probability 
[5 .6e-270 ■ 



Protein name 



Description 



Locus Name 



sp:GYRB_ECOLI



Acc# 

P06982:O08438



DNA GYRASE SUBUNIT B












ORF. Name 


NTID 


• AAID 


NT 
Length 


AA 
Length , 


Score 


Probability 


i2<5i,7627_ci_ 


_i 32 „ 


1952 


128 


387 


r° i 


1.2e-63



Protein name , 



Locus Name 



transposase , 



pir:I67760



Acc#
I67760



Description 



ORF Name 



134175180 c2 2 



Protein name 



NTID AAID 



TT 



NT AA 
Length .. Length 
190 



Score 



TTT 



Probability 
1.7e-08



Locus Name 



transposase- 



gp:AB026428 



Acc#
AB026428



Description 



Methylomonas aminofaciens ribulose monophosphate pathway genes (rmpD, rmpA, IS10-R rmpI, rmpB), complete cds.



ORF, Name 


NTID 


AAID 


•NT 
Length 


— Score 
Length 


Probability. 


16S90875_ti_2 


34 


1954 | 


82 


| ■ 249 | |90 


0.00026


Protein name ' - 








f . . 
Locus Name 




Acc#" 


TolR protein 








gp:PPPAL1




X74218 


Description ; ' ' ':. . . • 




Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.




ORF Name ■ 


NTID 


AAID 


NT' 
Length 


AA ' r. 

— , Score 
Length 


Probability 


I9b3953_c2_15 • 


35 


■ 1955 


534,, 


1605 | |1387 


,|?.3e- 


-142 


Protein name 








Locus Name 




Acc#. 



Description 



sp:ANIA_NEIGO



Q02219 



MAJOR OUTER MEMBRANE PROTEIN PAN 1 PRECURSOR



ORF Name 



NT ID 



AAID 



22667557 ±2 6 



NT AA 
Length Length 
177 . 



Score 



"BIT" 



Probability 
4.5e-!il 



Protein name » 



Locus Name 



Acc# 
007573 



Description . 
HYPOTHETICAL 16.6 KD PROTEIN IN GLPD-SPOVR INTERGENIC REGION



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


|J06442i7_t2J» 


57 


| |19B7 


i r i 








Protein name 








Locus 


Name 


* Acc# 


Description 














NO-HIT


ORF Name .-: 


NTID 


AAID 


. NT- 
Length . 


AA 
Length 


Score 


Probability . \ 


488i533_r2_7 ' 


3 8. 


■ ,1558 


1 ^ 


- 189 






Protein ' name '" , 


• ■ - (-■ , _ 






Locus 


Name 


Acc# ■ 


Description 














NO-HIT


ORF Name ' ,' 


NtlD 


AAID 


NT 

Length 


AA, 
Length 


Score 


Probability 


|6651712_t2_2 


3.9 ■ 


| 1359 


. 271 


815 


611 


1.6e-59 



Protein name 1 



Locus Name 



isocitrate lyase 



gp:AB004651



Acc#
AB004651



Description , 



Hyphomicrobium methylovorum gene for isocitrate lyase, inorganic phosphate transporter, methionine synthase; complete and partial cds.


ORF Name NTID 


NT * ' AA :i ' ' . 1 '■' 
AAID , — . , , ^— Score Probability . 

Length LeiiyLh — *- 


14647952_tl_i 40 


1960 912 . 2739 | 


2108 3.7e-218



Protein name 



Locus Name 



initiation factor IF2-alpha



gp: &VAJ2737 



Acc# 
AJ002737 



Description 

Proteus vulgaris infB gene and partial nusA and rbfA genes.



ORF Name 



NTID AAID 



115032818 cl 15 



NT 
Length . 
T7T— I 



AA 

T — • Score Probability 
Length — — — — 2 - 



[112 | |4.1e-0S 



Protein. name 



Locus Name



hypothetical protein



pir:G75410



Acc# 
G75410 



Description 



ORF Name 



NTID ' AAID 



21644075 cl 14 



NT 
Length 
199- 



AA 
Length 
1600 



Score Probability 



3.7e-^ 



Protein name 



Locus Name 



conserved, hypothetical protein 



pir:F75410



Acc# 
F75410 



. Description . 



ORF Name 



NTID 



AAID 



24550277 11. 3 



n 




Score 



Probability 
2.5e-52 r 



Protein name 

Description 
HYDROLYASE)



Locus Name 



sp:TRUB_HAEIN



Acc#
P45142



ORF Name - ' 


NTID . 


AAID 


NT 
. Length 


AA - 

i — ' Score 
Length 


' -Probability 


|3332760_t2_li- 1 


II 44 


|1964 




186 , | 




Protein name 








Locus Name 


Acc# ; " 


Description 












NO-HIT


, ORF Name 


, NTID 


AAID 


NT 
Length 


AA ■ , . , 
_ ■■— . ,.. Score 
Length 


. Probability 


p.407812_12_9., ,. 


l l 4y - 




|i<58 


[507 |. |2i5 | 


|1.4e-17 


Protein name ,. 








Locus Name. 


, ■ . Acc# . • ; 



sp:RBPA_ECOLI 



P09170 



Description . . 
RIBOSOME-BINDING FACTOR A (P15B PROTEIN)






ORF Name 



NTID AAID 



4573462 c2 24 



1966 



NT AA 
Length Length 
103 



ITT" 



Score Probability 
171 2.0e-12



Protein name 



Locus Name 



conserved hypothetical protein 



pir:F75410



Acc# 
F75410 



Description 



ORF Name 



NTID AAID 



4968825 i'2 5 



. NT AA 
Length Length 
217 



Score 



654 



Probability 
^.Ve-44 — 



Protein name 

Description ' * 
N UTILISATION SUBSTANCE PROTEIN A 



Locus Name 



sp:NUSA_ECOLI



Acc# 

P03003 

(NUSA PROTEIN) (L FACTOR)



ORF Name 


. : NTID 


AAID 


NT : 
. Length 


AA 

, — . , Score 
Length 


Probability 


7070265_il_4 


| 48 


1968 


62 | 


1189. 1 




Protein name 










Locus Name 


Acc#. 


Description . 














NO-HIT


. ORF Name 


■ NTID 


AAiD 


' ' MI 

Length 


AA , 
, — ■ Score 
Length 


Probability 




43 


| 1969 




957 | 164 


i.le-ii • • . | 


Protein name 










Locus Name • 


Acc# : 


hypothetical protein b1759




pir:G64935 


G64935 


Description 














ORF Name 


NTID , 


AAID 


NT 

Length 


AA 

+ — . , Score 
Length 


Probability v . • 


10729.52 Jr3_19 . 


1 50 


1970 . 


1 in 1 


996 |281 [ 


[2.5e-24 | 


Protein name 










Locus Name » 


Acc# 



Description 



sp:SUG2_YEAST



P53549:Q08718



PROBABLE 26S PROTEASE SUBUNIT SUG2 (PROTEASOMAL CAP SUBUNIT)



ORF Name 



NTID AAID 



1112580 tl 4 



ST" 



T57T 



NT , 
Length 
199 ' 



AA 
Length 

mrc — 



Score Probability 



1.7e-07 



Protein name 



Locus Name' 



hypothetical protein APE2554 



pir:C72489



Acc#
C72489



Description 
ORF Name 



NTID AAID 



1972 



NT 
Length 
167 : 



AA • ■ 
— , Score 
Length — 7 " 



Protein name 
Description 

NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 
195327Q2 c2 



•NTID AAID 



T57T 



NT ; 
Length 
513 I 



AA ' 
Length 
[1542 



Score Probability 
1454 7.4e-149



Protein .name . , ■ ; 

Description - 
ANTHRANILATE SYNTHASE COMPONENT I



Locus Name 



sp:TRPE_ACICA



Acc# 
P23315 



ORF Name 


NTID 


AAID .-' 


NT 
, Length., 


AA ■• • 

, Score 
> Length 


. Probability 


20939567_ti_l 


54 


.1974 


:138 


417 




Protein name • 








Locus Name 


; V. . Acc# 


Description 












NO-HIT












ORF Name 


NTID 


AAID 


' NT ' 
Length 


AA 

— . , Score 
.Length - - ■ 


Probability 


220709S5_i2_ll 


|!, 5 . 


197S 


| 1214 


p72 • | 


|6.018 



Protein name 



Locus Name 



alanine--tRNA ligase, alaS: alanyl-tRNA synthetase: alanyl-tRNA synthetase



pir:D70127



Acc#
D70127



Description 



ORF Name 


NT 

NTID .' AAID * _ t , ■ 

— : — — Length 


AA 
Length 


Score 


Probability 


23839667_cl_^b | 


56 1976 318 


957 


732 


2.4e 


-12. 


Protein name 






Locus 


Name 




Acc# 








sp:DAPA_HAEIN


P43797 


Description' 














DIHYDRODIPICOLINATE SYNTHASE (DHDPS)












ORF Name 


NT 

NTID AAID , — , 
■; Length 


AA 
Length 


Score 


. Probability 




|57 1977 . | J119 


360- ] 


pv; | 


5.1e-22


Protein name 






Locus 


Name 




Acc# 



sp:Y01B_MYCTU



Q10514 



Description . 
HYPOTHETICAL 39.6 KD PROTEIN CY427.11C



ORF Name 
30507291 TT ^0 



NTID AAID 



T9T5" 



NT 
Length 
174 . 



AA. 1 ' 
Length 
1525 



Score Probability 



Protein name . 
Description 



Locus Name' 



Acc# 



NO-HIT



ORF Name . ' 


. NTID 


AAID 


■ NT . 
Length 


; aa > 

Length 


Score 


Probability 


4792250__ci_26 - - 


59 


1979 


114 .- 


345 






Protein name 








Locus Name 


Acc#


Description 














NO-HIT


ORF Name 


NTID 


, AAID ' 


NT . 
Length 


.» M • • 

Length. 


Score 


• Probability 


5282805_c3^4 




1980 


^41 




I 786 1 


4.5e-78


Protein name 








Locus Name 


Acc#



sp:PUR7_ECOLI



P21155 



Description 
(SAICAR SYNTHETASE) 



ORF Name 


NT I D 


AAID 


"NTT 

Length 


AA 

t " — Score 
Length 


Probability *. 


2402043(M:2_I ' 


si ■ 


1981 


127 


J81 . [549 | 


|1.5e 




Protein name 










Locus Name 




ACC# 


transposase . 




pir:I67760




I67760


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

' — ^ Score 
Length 


Probability, 


129813_r2 1 


62 


1982 


, 126 


381 






Protein name : 










Locus Name 




Acc# 


Description ; 
















NO-HIT




ORF Name 


' NTID 


AAID 


NT 

Length' 


AA • 
T ■— . , Score 
Length 


Probability 


4391518_t2_4 . ■ 




1983' 


, ^4 


|195 |108 | 


3.2e-06


Protein name , ; 










Locus Name 




Acc# 












sp:THIX_HAEIN




P43787


Description 
















THIOREDOXIN-LIKE PROTEIN HI1115














■ORF Name 


NTID 


AAID 


NT • 
Length 


AA 

-y — , ' Score 
Length 


Probability 


4495268_i2 2 , :. 


I 64 


| ' 1984 


110 . | 


p.3" | 512 | 


4.9e-49


Protein name 










Locus Name 




■Acc# 


ferredoxin [3Fe-4S]


pir:FEAV






Description 














A29936:A00218


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

^- Score 
Length . - ■ 


Probability 




65 


1985' 


159 


480 |204 | 


2.1e-16


Protein -name 










Locus Name 




Acc#


hypothetical protein APE2447




pir:F72475


F72475 



Description 






ORF Name 



15577200 n 2 



NTID 
\G6 



AAID 



NT 
in 
TOT 



AA 

t Score 
Length Length , ~ 



Probability 
13^96-40 ~ 



Protein name/ 



Description 



Locus Name 



sp:CYSW_ECOLI



Acc# 

P16702:P76534



SULFATE TRANSPORT SYSTEM PERMEASE PROTEIN CYSW






ORF Name ' " NT ID 


AAID 


NT ' 
' Length 


AA ' 
Length 


Score 


Probability 




1987 


247 


741 


545 


6.4e-63



Protein name , 

Description ... ' ,. ( ; 

SULFATE TRANSPORT ATP-BINDING PROTEIN CYSA



' Locus Name 



sp:CYSA_ECOLI



Acc#

P16676:P77693



- ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


16054077_t3_2(V 




1988 | 


^0 . | 


; iS£3 . -v : 




Protein name 








, Locus Name 


Acc# 


Description-. 












NO-HIT


ORF. Name 


NTID 


AAID 


NT 

Length 


AA 

_■ — ■ Score 
Length 


Probability 


i649b465_ll 1 


69/ 


1989 


77 


' 234 „;|72 | 


|0.020 



Protein name 



Locus Name 



sp:YDIE_ECOLI 



Acc#
P40721



Description ; - 
HYPOTHETICAL 7.1 KD PROTEIN IN AROH-NLPC INTERGENIC REGION



ORF Name 



NTID 



AAID 



23485750 c3 36 



TO" 



1990 



NT AA 
Length Length 
— I 1207 



Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT






ORF Name 



NTID 



AAID 



23730017 cl 24 



7T 



NT AA 
Length Length 
947 . I' 12844-', 



Score 



Probability 
2.2e-36 



Protein name 



Description 



Locus Name 



sp:YTFM_HAEIN



, Acc# 
P44038 



HYPOTHETICAL PROTEIN HI0698 PRECURSOR


ORF. Name ., NTID AAID 


nt ; 

Length 


AA 

— Score ' 
Length 


Probability 


23859387_t2_I4 j 72 1992 


1 P % 


. 891 




|0.048 | 


Protein name 




Locus Name 


Acc# 


conserved hypothetical protein yrrB


pir:H69978


H69978 


Description 










ORF Name . NTID AAID 


NT 

' ■ Length 


AA 

— , jti Score 
Length. , \ 


Probability 


34119002 13_18 / 73 1993 


444. 


1335 


714 ; 


1.1e-69


Protein name, 




Locus Name 


Acc# 


2-acylglycerophosphoethanolamine acyltransferase (aas) RP620


pir:E71657 , 


E7166 7 








Description . 










ORF Name ■ -NTID;. ' AAID 


NT 
Length ' 


AA '' ' ' 
, - — • Score 
Length 


Probability . 


4480217_c3_35 • 74 1994. 


v 1675 / 


5028 . 


578 


|1.5e-79 . 


Protein name 




Locus Name , 


. Acc# 






sp:YTFN_HAEIN


Q57523 


Description 










HYPOTHETICAL PROTEIN HI0696


ORF Name' " : NTID AAID 


NT . 
Length . 


AA 

, — . , Score 
Length 


Probability" 


12378407_c2_32 75 ■ 1995.. 


. 278 


.834 , 


626 ■ 


4.1e-61 


Protein name 




Locus Name 


1 Acc# 



sp:PDXJ_ECOLI



P24223 



Description 

PYRIDOXAL PHOSPHATE BIOSYNTHETIC PROTEIN PDXJ






ORF Name 


NTID - 


, AAID 


NT 
Length 


AA 

T — -. i Score 
Length 


Probability 


i4487952_ti_7 






1 P ' 


,219 | 




Protein name 








Locus Name 


' Acc# 


Description 












NO-HIT 


ORF Name . ■ - 


NTID 


AAID , 


NT 

Length 


AA 

. — . , : Score 
Length 


Probability 


161402 c2_29 


77 - 


| 1997 




186 |59 | 


|0 - 0i8 


Protein name 








Locus Name 


' Acc#' 


envelope glycoprotein 




•j. 


gp:HIVU90070


U90070


Description 












HIV-1 strain vTtfiS from Vietnam, envelope glycoprotein V3 region (env) gene, partial cds.


ORF Name - • 


NTID ' 


AAID 


NT 
. Length 


AA . ■ 
— , Score 
Length - ■ 


Probability 


16171905_c2_28 , 


78 


' 1998 


67 


204 | 


■ ■ i 


Protein name 








Locus Name 


' ■■"* Acc#. 


Description 












NO-HIT


: ORF Name 


NTID 


AAID 


NT 
Length 


'■ ^ . 

— , Score - 
Length 


.. Probability 


223243Jl_12_i6 < 


79 


; 1999 


.77 


P 34 1 




Protein name/ 








Locus Name 


' Acc# 


Description 












NO-HIT


ORF Name 


NTID 


: AAID 


■NT • 
Length 


AA • 

— . , Score 
Length 


Probability 


22463 : 311_t3_22 


1 80 


2000 


1 103 . 1 


312 • 




Protein name . • 








• Locus Name 


Acc# . 



Description . 
NO-HIT 






ORF Name 


NTID 


AAID 


Nt 

1M J. 

Length 


AA 

T ~ ± 1, . Score 
Length 


Probability 


23442503_c2_3i 


51 , 


2001 ' 


346 


1041 831 


7.7e 


- o J 


Protein name 










Locus Name 




Acc# 


Era




gp:AF123492 


AF123492 


Description 














Pseudomonas aeruginosa rnc-era-recO operon, complete sequence






ORF Name 


NTID . 


AAID 


NT 
Length 


AA 

T . — L1 Score 
Length 


Probability 


244i278i_c3_"i4 


|82 


|. 2002 j 


I 101 1 


P 06 1 






Protein name 










Locus Name 




Acc# . 


Description 
















NO-HIT . 


,., ORF Name 


NTID 


AAID 


NT . '• 
Length 


AA 

, — , , Score 
Length 


Probability 


265705>25_c2_30 


" 8 3 


2003 


268 


|807 | |500 | 


9.1e-48


Protein name 










Locus Name 




Acc# 



■ Description 



sp:RNC_ECOLI



P05797:P06141



RIBONUCLEASE III (RNASE III)












ORF Name 


NTID 


AAID 


NT 
. Length 


AA 
Length 


Score 


Probability 


26678567_g1_j24 - 


84; 


2004; 


• 6.3 ; 




192 


88 1 


0.00042


Protein name 










Locus Name 


Acc#


hypothetical protein 29.1








pir:S59084


S59084 


Description 
















ORF Name 


NTID 


AAID /" 


NT 
Length 


1 AA 
Length 


Score 


Probability 


35161562^ci_27 




2005. 


212 




639 


I 103 1 


. 10 r 0015 


Protein name 










Locus Name 


. Acc# 


RecO 










gp:AF123492


AF123492 


Description 



complete sequence . 






ORF Name 



NT I D 



AAID 



4063308 c3 3i> 



T0T>£~ 



NT ;. AA 
Length Length 
507 ' I 11824 



Score Probability 
12257 | |5.9e-234 ~ 



Protein name 



Description ^ 



Locus Name. 



sp:LEPA_HAEIN



Acc#
P43729



GTP-BINDING PROTEIN LEPA 


ORF Name NTID 


NT 

.AAID — ^ 
. Length 


AA ■ 
T — ^ Score 
Length 


Probability 


4100003_13_20 87 


2007 | |159 


480 | 624 


6.6e-61 


Protein name 




Locus J Name 


Acc# 



Description 



sp:Y882_HAEIN



P44068 



HYPOTHETICAL PROTEIN HI0882


, ORF Name , - NTID AAID 


NT , 
Length 


AA . 
Length 


Score 


Probability ' 


7032838_C3_36 88 2008, 


| 357 


' 1104 - 


276 


2.0e-44 



Protein name 



Locus Name 



signal peptidase I 



gp:ECOK12RIII



Acc# 
D64044 



Description 
Escherichia coli ribonuclease III and other genes, complete cds.



ORF Name 



NTID 



AAID 



NT, ■ AA, 

_ — _ ' ~. Score 
Length Length - ■ 



5655702 13 21 



2005 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 



NTID 



AAID 



NT AA 
— — '~' Score 
Length - Length — — 



10802330 13 20 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



NTID AAID 



12714056 cl 22 



2011 



NT AA 
Length Length 
377 I 111-34 ■ " 



Score 



11472 



Probability 
|9.1e-151 



Protein name 



Locus Name 



putative formaldehyde dehydrogenase



gp:PSP243941



Acc#
AJ243941



Description 

Pseudomonas sp. strain HR199 partial vanB, tan, gcs, ehyA and ehyB genes.



ORF Name 



NTID AAID 



NT 



AA 



Length Length 



Score • ,■ Probability 



14«44626_c2_34 


\ 92 


2012 


|202 | 


P9 | |93 | 


|0.028 


Protein name 








Locus Name 




Acc# 


transcription regulator, TetR family




pir:F75482




F75482


Description 














ORF Name 


NTID 


AAID ■ 


NT 
Length 


AA ; sc ■ 
■ Length 


Probability 


15705056_ci_24 


53 


2013 


72 • 


|2 19 






Protein name . 








•Locus Name 




■ Acc# 


Description 














NO-HIT


' ORF Name 


NTID 


/. AAID ' V 


NT 

Length 


' ' AA ■ 
„ — Score 
Length 


Probability 


159667_c2_31. 


| 94 


2014 ; 


61 


■ 204 






Protein name - ■ 




> - 




: , ~ . Locus Name 




Acc# 


Description 














NO-HIT


' ORF Name* 


' NTID 


AAID 


NT," 
Length 


• AA 

, • — , Score 
Length 


Probability ' 


30079bl2_t3^17 


95, 


2015 


76 


pi , |87 | 


0.00053



Protein name ■ 

Description , '.' 
NITROGEN FIXATION PROTEIN FIXS 



Locus Name 



sp:FIXS_RHIME



Acc# 
P18399 



ORF Name 
25578402 Tl 7 



NT ID AAID 



12016 



NT AA 
Length ' Length 
TO 1 . 11326 



Protein name 



Score Probability 

[Hi 6 | |4.8e-113 ~ 

Acc# 



: Locus Name ' 



sp:YEEF_ECOLI



P33016 



Description , . 
HYPOTHETICAL 49.8 KD TRANSPORT PROTEIN IN SBCB-HISL INTERGENIC REGION



ORF Name 


NT 

NT ID AAID — , 
Length 


AA 

; — - - , Score 
Length 


Probability 


J-7J.VJO/0 J_J5. ID 


97 | |2017 | 501 | 


1506 | 742 | 


2.1e 




Protein name 




Locus Name 




Acc# 






sp:YDIU_ECOLI




P77649:P76904


Description 








HYPOTHETICAL 54.4 KD PROTEIN IN AROH-NLPC INTERGENIC REGION




i 


ORF Name \ 


NT 

NT ID AAID , 

Length 


AA 

. — . , Score ' 
Length 


Probability , 


5097812_c2_30 


98 .. |2018 ; 120 | 


363 . | |280 | 


1.9e-24


Protein name 




Locus Name. 




f Acc# ' 






sp:YAIM_ECOLI




P51025:P77317


Description- 








HYPOTHETICAL 31.4 KD PROTEIN IN MHPT-ADHC INTERGENIC REGION







. ORF Name , 
S2T33T8 12 10 



■NTID AAID. 



~5T 



2019 



NT 
n 



AA ^ 
— : — Score i 
Length Length ■ 



WTTT 



Probability 

II. be -lb -,.:»■:■ 



Protein name 
hypothetical protein HP0861



Locus Name 



pir:E64627



Acc# 
E64627 



Description 

ORF Name 
16740877 13 lb 



. NTID AAID 



2020 



NT , AA 
Length . Length 
1406 



Score - . Probability 
1221 I 1639 I", .|1.7e-62 ~ 



Protein name 



Locus Name 



stearoyl-CoA desaturase



gp:AF026401



Acc#
AF026401



Description , 
Mucor rouxii stearoyl-CoA desaturase 1 (Ole1) gene, complete cds.



ORF Name 



NT ID 



AAID 



994001 cl 23 



TUT 



TOTT 



NT 
n 



AA 

t — "^v Score 
Length Length _ . — —■ ^ 



FIT" 



S7T 



Probability 
1.7e-55



Protein name 



Locus Name



sp:YHIS_ECOLI



Acc# 
P33018 



Description ■ ' " ' 

HYPOTHETICAL 31.3 KD PROTEIN IN FOLE-CIRA INTERGENIC REGION 



ORF .Name 


NTID 


AAID 


NT 
Length 


AA 

T — , . Score , 
Length ■■ 


Probability- 


1048137_c3_Sb- 




. wh • 


1 67 1 


I 204 .1 




Protein name 








Locus 'Name 


Acc#


Description 












NO-HIT


. ORF Name 


NTID, " 


AAID 


' NT ' 
Length 


- — . , Sc.dre 
Length ,. 


Probability, 


i0585925J:IJJ 




2023 


73 


1222 




Protein name 








. - Locus Name , 


Acc# 


Description 












NO-HIT . . 


ORF Name 


" NTID 


AAID 


NT ; 

'= Length ; 


AA 

- ' — . Score 
Length -. 1 


Probability 


i48859I0_c2j.bl 


| 104 


2024 


86 


258 . ; [71 , | 


|0.025 


Protein name 








Locus Name 


Acc# , 


PagK




i 




gp:AF013775


AF013775



Description 



Salmonella typhimurium PagK (pagK), PagM (pagM), and PagO (pagO) genes, complete cds



ORF Name 



NTID AAID 



22554587 c3. 57 



TOT" 



NT AA 
Length ' Length 
159 <: 



Score Probability 
1480 I |1.2e-45 ~ 



Protein name 



Description 



Locus Name 



sp:SMPB_ECOLI



Acc# 

P32052:P77011



SMALL PROTEIN B (18.3 KD PROTEIN)



NT AA 

ORF Name NT ID AAID ,— > — , Score Probability 
■ — • - Length Length — = 



23437838_t3_28 


|106 2025 


725 :i 


2178 


I1684J 


3.1e-173


Protein name








Locus Name 




Acc# 










sp:DNLJ_HAEIN




P43813


Description • 


















• ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


23468813^t3_22 


107 | 2027 




930 


294 


6.2e-26


Protein name' 








Locus Name 




Acc# 


putative permease BitE




gp:SHU75349 




U75349 


Description 
















Serpulina hyodysenteriae bit operon, complete sequence.






ORF Name 


NTID AAID . 


. NT 
■Length 


AA 
Length 


Score 


Probability 


234807_tl_ll . 


|108 2028 


175 


528 


456 


4.2e-43


Protein name 








Locus Name 




. Acc# . 


lipopolysaccharide core biosynthesis protein kdtB homolog




pir:S72166 




S72166
















Description 
















ORF .Name ,. 


NTID AAID- 


NT 
Length 


AA 
Length 


Score 


Probability ' 


23705040jr2_16- ' 


109 - | 2029 


550 ;.- 


1083 - 


F° 1 


8.8e-66


Protein name 








Locus Name 




Acc# 










sp:POTA_HAEIN




P45171 


Description. 
















SPERMIDINE/PUTRESCINE TRANSPORT ATP-BINDING PROTEIN POTA






ORF Name 


NTID ! AAID 


, NT AA' 
Length Length 


Score 


Probability 


2372<^87_r2_i7 


110 2030 v; 


335 


|1008 | |745 | 


9.9e-74


Protein name 








Locus Name 




Acc# ' 


conserved hypothetical protein yddN 




pir:F69776




F69776 



Description 






ORF Name 



NT ID 



AAID 



TTT 



NT AA 
Length Length 
219 I 1660 



Score Probability 



Protein name 



Description 



Locus Name. 



ACC# 



NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


24252305>_12_19 J 


112 


2032 


258 • 




897 


155 , 


• 8.2e-09 



Protein name 



Locus Name 



sp:YDPC_BACSU



Acc# 
P96680 



Description 

HYPOTHETICAL 33.6 KD PROTEIN IN CSPC-NAP INTERGENIC REGION



ORF Name 



NTID. 



AAID 



24797312 ±1 4 



TTT 



Protein name 



hypothetical protein PH1114



Description 



NT AA 
Length Length 
T75 — 



Score 



TTT 



Locus Name 



pir:C71052



Probability 
5.0e-06



Acc#
C71052 



ORF Name 


NTID AAID ': 


"NT 
Length 


AA .■ 
- — , Score " 
Length \- r --■ 


Probability 


25901467 c3 5.4 , 


114 • 2034 


88 ■ 


267 .,,.7 . 




Protein name 






Locus Name 


Acc#. . '. , 


Description ' 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

. — . , '.- Score..* 
Length 


Probability 


3027205ij±3_24 ' 


115 2035 


248 


|747 | .|169 


2.2e-11


Protein name 






Locus Name 


Acc# . 


probable morphological 
differentiation-associated protein 


pir:T36679


T36679





Description , 



ORF Name 



NTID AAID 



3323^02 ti 23/ 



TUT 



OT31T 



; .NT ' AA 

Length ' Length 
TZZ 



— , . Score 



OTT" 



Probability 
|2.3e-19 



Protein name 



Locus Name 



permease protein 



gp:CJAJ750



Acc# 
AJ000750 



Description 



Campylobacter jejuni malF gene, partial.






ORF Name 


NTID 


NT 

AAID. — , 
Length 


AA 

— , Score 
Length 


Probability 


359765i0_t3__30 


117 


2037 . 8? 


|270 | 343 


4.0e-31



Protein name 



Description 



Locus Name . 



pinFEKRV' 



Acc# 

S72167:S78121:A00210



ORF Name 



NTID . ' AAID 



35383542 ri 13 



TTT 



OTT3" 



NT ' AA 

Length , Length 
107 



Score Probability 
321. | p~ 



Protein name 



Locus Name 



KH type splicing regulatory protein 



gp:HSKHSRP3



Description 



|5.9e-05 



Acc# 
AF093747 



Homo sapiens KH type splicing regulatory protein (KHSRP) gene, exon 2 and partial cds.


ORF Name 


NTID AAID , 


■ NT : 
Length 


. AA 
Length 


Score . Probability 


3923288_clJ39 . 


119- . 203 9 


343 ; 


1032 


254 1.1e-21



Protein name 



Locus Name 



probable regulatory protein (pfoS/R)



pir:E71373 



, Acc# 
E71373 



Description 



ORF Name 



NTID AAID 



3938393 c3 64 



TOT" 



OTTO- 



NT AA 
Length Length 
[2T3 — 



Score Probability 
. [TOT" 



1.0e-71 



Protein name 



Locus Name 



uracil phosphoribosyltransferase, upp



pir:A65026



Description



Acc# 

A65026:S23412






ORF Name 



NT I D 



AAID 



NT AA 
Length Length 



Score Probability . 



4064638_tl_3 |121 2041 , 371 ■ 


1115- 


152 


4.7e-08


Protein name 


Locus Name 


. Acc# 




sp:Y131_HAEIN


P43951 


Description 








HYPOTHETICAL PROTEIN HI0131 PRECURSOR 


ORF Name NTID AAID , ^ ; 

Length 


. AA ' 
1 , — . , , Score 
Length 


Probability 


|41ul56a_13_29 |122 | |2042 263 


1 


512 


4 . ye-49 . 



Protein name 



' Description 



Locus Name 



Acc# 
• £56691 



(NADPH-FMN OXIDOREDUCTASE)


ORF Name' NTID AAID 


• NT 
Length 


AA 

— r , Score , 
■Length 


Probability 


682641. cl 33 . . |123 2043 1 


86 


, 261 |100 | 


|2.2e-05 


Protein name " ' ' 




Locus Name 


Acc# 


hypothetical protein PH0217




pir:G71244


G71244 


•Description ■ / ,, 








ORF Name ' ' NTID" AAID . 


•NT 
Length 


AA . , . 
. — . , Score 
Length 


Probability 


s ipV90_t3_6fl ; |124 " 2044 . • 


• 731 . 


. 2196 594 . 


9.0e-86 



Protein name 



Description 



Locus Name 



sp:PRIM_HAEIN



Acc# 

: 

Q08346



DNA PRIMASE


ORF Name 


NTID 


AAID 


NT / - 
Length 


AA - ' 
, — ' _ Score ... 
Length v 


Probability 


|ii9ul2_c3_ii8 


| 125 


j 12045 


| 438 


(1317 ; [1830,] 


1.1e-188


Protein name 








Locus Name 


■ ACC# : ' 



Description

HYPOTHETICAL PROTEIN HI0125 



sp:YJCD_HAEIN



P44530 






ORF Name 


NT I D 


AAID 


NT AA 
t — L1 T — . , Score 
Length Length 


Probability 


12214386_c3__117 


|12 6 


2046 


125 ■ | 


3 78 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 


ORF Name 


NT-ID 


AAID 


NT 
Length 


AA score 
Length 


Probability 


12540957_c3_i2i . 


127 


2047 - 


280 


843 227 


7.7e-19


Protein name 










Locus Name 




Acc# . 


probable ytiH protein . 




pir:A70579 


A70579 


Description 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


12593961_t2_35 


128 


2045 


68 | 


207 






Protein name 










Locus Name 




Acc# ; 1 


Description ' 
















NO-HIT


ORF Name. 


NTID 


AAID 


NT 
Length 


AA 

— ■■ ' Score 
Length 


Probability' 


19532813_i3_73 


129 ' 


2049 


134 


405 ; , 252 


1.7e-21


Protein name . , 1 ' 










Locus Name ; 




Acc# 


RpsT protein 










gp:VCNHAR




AJ002395


Description 
















Vibrio cholerae nhaR, niyu, mviN, and rpsT genes.






ORF Name 


NTID 


AAID ' 


NT 
Length 


AA 7 - 
„ — , , Score 
Length 


Probability ■ 


209375_ci_9b / 


130 ., 


2050 . 


|Vb0 | 


2250 |1867 | 


1.3e-192


Protein name 










Locus Name 




Acc#- 












sp:CLPA_ECOLI 


P15716:P77686


Description • *• 


















ORF Name 



NTID 



AAID 



21641077 t2 49 



NT AA 
Length Length 
199 



Score 



Probability 
4.3e-08



Protein name 



Locus Name 



hypothetical protein 



gp : SYC5LLE 



Description 



Acc# 

D64003:AB001339



Synechocystis sp. PCC6803 complete genome, 22/27, 2755703-2858766.


ORF Name 


NT ID ' 1 


■'■'AAID 


NT AA 

— — . — _' . Score Probability 
Length Length — — 


22143827_ci_89 


152 • 


2052 


250 753 245 7.5e-21


Protein name • 






i! . ■ Locus Name ' Acc#. 



Description 



sp:YIV8_YEAST



P40582 



HYPOTHETICAL 26.8 KD PROTEIN IN HYR1 3' REGION








. ORF Name 


NTID . AAID 


NT 
Length 


AA .' 
_ — Score 
Length 


Probability 


22453453 . c2 104 


1 133 , 1 .F 3 \ 


426' 


1281 • 492 - 


6.4e-47


Protein name 








Locus Name ■ 




Acc# , 


carboxyl-terminal proteinase






pir:F70369


F70369 


Description 














" ORF" Name , 


i 

NTID AAID r , . 


NT 
Length 


AA 
Length 


, Probability. 


22831262_ci_.94 


•134 2054 " ; 


128 .. 


387 |185 | 


2.2e-14


Protein name ". 








: Locus Name . 




ACC# 



sp:YLJA_ECOLI



P75832 



Description ■'. ' ; . .• 

12.2 KD PROTEIN IN CSPD-CLPA INTERGENIC REGION 



ORF Name . ■ 
|23532215_t2_59 

Protein name 
Description 
NO-HIT



NTID AAID 



TT5~ 



NT AA 
Length Length 
64 I [X1T5 — 



Score ' Probability 



Locus Name 



Acc# 






ORF Name 



NT ID AAID 



23545875 cl 84 



NT AA 
Length Length 
505 I • 11818 



Score Probability 



2.1e-71 



II 



Protein name 

, Description . - 



Locus- Name 



sp:CYDD_ECOLI



Acc# 

P29018:Q47656:P77275



ORF Name 



NTID AAID 



23875303 c2 10y 



TTT 



NT . AA 
Length Length 
72 



Score Probability 



[2T9~ 



Protein name 



Description 



Locus Name 



Acc# 





ORF Name 


NTID - 


AAID 


NT 
Length 


AA 
Length . 


Score 


Probability 


24236S42_cl_91 


138 . 


2058 


350, 


1053 


695 


2.0e-68 



Protein name 1 • 

*■• ■■ ■ ■* .• , 

Description 

(PSEUDOURIDYLATE SYNTHASE) (URACIL HYDROLYASE)



Locus Name 



sp:RLUD_ECOLI



Acc#

P33643:P77003



ORF Name 



NTID AAID 



NT AA 

— ' — , Score Probability 
Length Length — — - - 



24317757 ±3 67 



12059 



TTUT 



TBT" 



4.4e-32 



Protein name 



Locus Name . 



sp:YPIY_PSEAE



Acc# 
P33641 



Description 

HYPOTHETICAL 38.5 KD LIPOPROTEIN IN PIL5 5' REGION PRECURSOR (ORFY)



NT 



AA' 



ORF Name • NTID AAID , — , / — ., S core Prob ability 

Length Length — — — - 



24318805 12, 60 



TT0~ 



TFT 



14 ,8e-.l? 



Protein name. 



Locus Name • 



hypothetical protein. 



gp:ASA224767 



Acc# 
AJ224767



Description 
Acinetobacter sp. ADP1 lon gene and ORFs.






ORF Name 



NTID 



AAID 



24417012 12 52 



T3T" 



NT • AA 
Length Length . 
252 



Score 



[TOT" 



Probability 
10 /Oil \ 



Protein name 



Locus Name 



LpsB 



gp:AF193023



Acc# 
AF193023 



Description 



Sinorhizobium meliloti GreA (greA), LpsB (lpsB), LpsE (lpsE), LpsD (lpsD), LpsC (lpsC), and Lrp (lrp) genes, complete cds.



ORF ■ Name 



NTID 



AAID- 



24550052 c3 119 



, NT AA 
Length , Length 
234 



Score 



Probability 
3.1e-09 



Protein name 



Locus Name 



hypothetical protein C33F10. 3 



pir:T15745



Acc# 
T15745 



Description 



ORF Name 



NTID AAID 



246B9bO c3 123 



NT AA 
Length Length ' 
57 



Score 



TOT" 



Probability 
1.4e-05



' Protein name ; 
Description 



Locus Name 



sp : COPAJIULl'U 



Acc#
O32619



ORF Name 



NTID 



AAID 



25391941 c2 115 



NT ' • "AA • 
Length ' Length 
298 < r 



Score ■ Probability 1 , 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA „ : 
T — , , Score 
Length 


; Probability 


261635^11_15 




J 12055 


1 P 17 


654 |503 | 


i.le 


^ ■ .) 


Protein name 










Locus Name 




Acc.# ' 


response regulator 


GacA 








gp:AF115381 


AF115381 



Description 



Pseudomonas aureofaciens 30-84 response regulator GacA (gacA) gene, complete
cds.



ORF Name 



NTID 



AAID 



51431512 tl 22 



NT ■' 
Length 
182 ? 



AA 
Length 
'|b49 



Score .. Probability 



|4.8e-26' 



Protein name 



Locus Name 



bacterioferritin comigratory protein



pir:F71971



Acc# 
F71971



Description 



ORF Name 



NTID AAID 



31832188 c2 114 



NT . 
Length 
440 



AA 



Score Probability 



Length 
1323 | [1025 | |2.1e-103 



Protein. name 



Description 



Locus Name 



sp:Y290_HAEIN



Acc# 
P77868 



PROBABLE CATION-TRANSPORTING ATPASE HI0290








ORF Name • . . NTID ' AAID 


NT . 
Length 


AA ., 
Length 


Score 


Probability 


33845302_c2_iib , 148 2058 


288 . 


.857 


" 553 


5.6e-64 



Protein name ■ 

Description 1 
PROBABLE CATION-TRANSPORTING ATPASE HI0290



Locus Name 



sp:Y290_HAEIN



Acc# . 
P77868



ORF Name 



NTID .AAID 



35974750. 12 '38 



WW 



2uW 



NT ;■ 
Length 
261 



AA 1 
Length 
786. v 



Score Probability 
|503 ' | [l.le-58 ; ~ 



Protein name 



Description 



Locus Name 



sp:YB(3I_HAi!lN 



Acc# 

Q57354:O05008



HYPOTHETICAL PROTEIN HI0105










ORF Name 


NTID : AAID . 


NT 
Length 


./ AA 
Length 


Score 


Probability . 


4806.512_c2_96 


150 . 12070 


■ 463 


|1392 


1501 1 


7.7e-154



Protein name 



Locus Name 



hypothetical protein 7 . " 



pir:T00129 



Acc# 
T00129 



Description 






ORF Name 



NTID AAID 



5109545 c2 99 



2071 



NT 



• AA 

T "^l'ij Score 
Length Length ■ - — 



Probability 
|4.3e-45 



Protein name 



Description 



Locus Name 



sp:CYDC_ECOLI



Acc# 
P23886 



TRANSPORT ATP-BINDING PROTEIN CYDC












ORF Name 


NTID AAID 


NT 
Length 


• AA . ' 
Length . 


Score 


Probability 


S7I8_c2_103 


• 1152 2072 
1 


531 | 


: 1595 


(145.7- 1 


3.5e-149


Protein name 








Locus Name 


Acc#










sp:PMGI_ECOLI


P37689 


Description 














— ivr* ^ a — ^ — n — 




(BPG-INDEPENDENT PGAM)




ORF Name 


*' ■- . NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


5837753 J: 1__23 


153 2073 


224 : 


.675 • 


1147 I 


3.2e-08


Protein, name 








Locus Name 


' . Acc# 


capM protein (capM1) RP344






pir:B71691


B71691


Description 














ORF Name 


NTID AAID' ' ' 


NT. . 
Length 


AA. 
Length 


Score 


Probability 


789Sil_.ci_U« 


154 2074 


892 : - 


2679 


J2203 | 


|2.3e-2bb 


Protein name 








Locus Name, - 


Acc# 



Description 



sp:GYRA_ECOLI



P09097



DNA GYRASE
















ORF Name 


NTID 


AAID ' ; . 


NT 
Length 


AA 
. Length 


Score 


Probability - 


&S(S-«ft_c3i5 


155 


.|2075 


|262 


759 , : 


1149 


1.5e-116



Protein name 



Locus Name 



multidrug transporter homolog 



pir:G69005



Acc# ' 
G69005 



Description 



ORF Name 



NT ID 



AAID 



NT AA 
Length Length 



12985037 c2 ,42 



|156 




2076 




158 




477 




354 





Score : Probability 
|1.9e-31 . 



Protein name 



Description 



Locus Name 



sp:PILQ_PSEAE



Acc# 
P34750 



FIMBRIAL ASSEMBLY PROTEIN PILQ PRECURSOR








ORF Name NT ID 


AAID 


NT 

Length 


AA -•' ' 
T — , ' Score, 
Length 


Probability 


14301467_c3_49< |ib7 


| 12077 : 




|696 316 


2.9e-28 ■ 


Protein ' name 








Locus Name 


' Acc# 


carbonic anhydrase 




pir:D75298


D75298 


Description 












ORF Name NTID


AAID 


NT 
Length 


AA Score 
Length 


Probability 


i^Tcl__31 ., .. ;■ ; 158 


2078 


501 


1506 |I393 | 


|2.1e-142 


Protein name ... 








Locus Name 


Acc# 



sp:YLEA_HAEIN



Q57163



Description 
HYPOTHETICAL PROTEIN HI0019



ORF. Name ■", 
19615687 FT 6 



NTID 
1159 



AAID 



NT 
Length 
87 ' 



AA „ • 
T Score 
Length 



Protein name 
Description : , 
NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 



NTID AAID 



23445308 12 18 



2080 



NT 
Length 
224 



AA 
Length 
' 1672 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID 



AAID - 



239b9b^2 ci yi 



NT AA ! 

Length ■ Length 
233 



7uT" 



'Score Probability 
1140 I |1.7e-09 



Protein name 



Locus Name 



pilus expression protein 



gp:PSEPONA



Acc# 
L28837 



Description 



Pseudomonas syringae penicillin binding protein (ponA), membrane proteins
(pilN, pilO), pilus expression proteins (pilM, pilP) genes, complete cds and
pilus expression protein (pilQ) gene, partial cds.



ORF Name 



NTID 



AAID 



24040911 cl 33 



MI 



NT .. • AA 
Length Length 
327. 



Score 



T2JT 



Probability 
5.3e-30



Protein name 



Description 



Locus Name 



sp:PILQ_PSEAE 



, Acc# 
P34750 



FIMBRIAL ASSEMBLY PROTEIN PILQ PRECURSOR



ORF Name 



NTID 



AAID 



34510950 c2 39, 



NT AA 
Length Length 
1545 ~ I 11938 



Score Probability 
201 4.9e-15



Protein name. 



Locus Name 



membrane protein 



gp:PSEPONA



Acc# ' 
L28837 



Description > 



Pseudomonas syringae penicillin binding protein (ponA), membrane proteins
(pilN, pilO), pilus expression proteins (pilM, pilP) genes, complete cds and
pilus expression protein (pilQ) gene, partial cds.



ORF Name 



NTID 



AAID 



34589051 cl 36 



NT AA 
Length Length 
183' 



Score 



Probability 
1.8e-33



Protein name 



Locus Name 



lactoylglutathione lyase, glyoxalase I



pir:A46714 



Description 



Acc#

A46714:A46623






ORF Name 



NTID 



AAID 



4504595 ci. 34" 



2uW 



NT AA 

'■ — — — Score 
Length Length '-f — . 



37b 



Probability 
2.4e-88



. Protein name 



Description
3-DEHYDROQUINATE SYNTHASE



Locus Name 



sp:AROB_NEIGO



Acc#
O50468



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Score 

Length 


Probability 


4877328_cl_35 


• 155 


. 2085 


318 | 


957 




Protein name 








Locus Name 


Acc# • 


Description 












NO-HIT












ORF Name 


NTID 


AAID 


NT 
Length 


AA, scor 
Length 


' Probability 


7042153_c2_43 ■ 




2087 


231 


' 595 452 


1.1e-42


Protein name .. 








Locus Name 


Acc# 



Description 



sp:AROK_HAEIN



P43880 



SHIKIMATE KINASE (SK)












ORF Name 




NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


7083457_c3_46


168, -i 


2088 


215 ■ 


551 • 


154 


4.2e-11



Protein name 



Locus Name 



fimbrial assembly protein pilO



pir:S77728



Acc#
S77728



Description 



ORF Name 



NTID 



AAID 



23703142 en 



[2uW 



NT AA 
Length Length 

7u73 — n 



Score 



^T5~ 



Probability . 
4.5e-52 — 



Protein name 



Locus Name 



sp:YJEK_ECOLI



Acc# , 
P39280 



Description 

HYPOTHETICAL 38.7 KD PROTEIN IN MOPA-EFP INTERGENIC REGION






' ORF Name 



NTID 



AAID 



34119052 tl I 



T7TT 



2090" 



NT AA 
Length Length 
204 



Score 



Probability 
|4.9e-65 



Protein name 



.Locus Name 



translation elongation factor EF-P



pir:S34443 



Description ' 



Acc# 

S34443:S56375:A65225



ORF Name 


NTID 


AAID 


NT, 
■ Length • 


AA - 
t ~~ . i Score 
Length 


Probability 


32712915_c2_17 


■171 


2091 


77 , 


234 






Protein name 








Locus Name 


■■■ Acc# 


Description 














NO-HIT


ORF ' Name 


NTID 


AAID 


- NT 
Length 


, AA 

' — ; ■ Score 
Length 


Probability 


33984701_t3_10 


172 


2092 


579 


1740 


1233 


1.9e-125



Protein name 



Locus Name 



sp:PMSR_NEIGO



Acc# 
P14930 



Description

PEPTIDE METHIONINE SULFOXIDE REDUCTASE (PEPTIDE MET(O) REDUCTASE)



ORF Name 



NTID 



AAID V 



36131500 c3" 21 



ITT 



NT AA 
Length Length 
1308 >" 



"9T7" 



Score ' Probability 



3.4e-£4 



Protein name 



Locus- Name 



sp:HTPX_ECOLI 



Acc# r 
P23894 



Description '. 
PROBABLE PROTEASE HTPX (HEAT SHOCK PROTEIN HTPX)



ORF Name 



NTID AAID 



3907578 cl 15 



T7T 



TOW 



NT AA 
Length Length 
299 



Score 



Probability 
2.1e-55



Protein name 

Description 
PYROPHOSPHORYLASE) 



Locus Name 



sp:DHPS_ECOLI 



Acc# • 

P26282:P78110






ORF Name 


NT ID 


AAID 


MT' 
JM ± 

Length 


AA 

~ ' Score 
Length 


Probability 


4100312_i3_13 


| 175 


2095. 


|106 | 


321 




Protein name 








Locus Name 


' Acc# 


Description 












NO-HIT


■ ORF Name 


. NTID 


. AAID 


NT 
Length 


AA 

■ — , Score 
Length 


Probability 


48828062_c2_16 


|176 


2096 


' 115 : 


348 , -* 223 


1.2e-16



Protein name 



probable transglycosylase 



Description ' 



Locus Name 



pir:T12796



Acc# . 

T12796:A69911



ORF Name 



NTID AAID 



831319. t2 7. 



nrrr 



Protein name 



Description 



12097 



■NT , 
Length 

— 



AA 

- — L1 Score " Probability 
Length - — • — — — : — r 



1419 1 IT^T 



Locus Name 



1.4e-124 



sp:HFLX_ECOLI



Acc# 
P25519 



GTP-BINDING PROTEIN HFLX






ORF Name , 


NT 

NTID . AAID — ■ < 
• Length". 


AA 

J — . r Score. 
Length 


Probability. 


87Q250_£2_6 


178 2098 255, 


|768 ;| 394 


1.6e-36


Protein name ■. 




Locus Name 


Acc# 


hypothetical protein in endA-gshB intergenic


pir:A65080


A65080 


region 














Description 








ORF Name 


■-, . NT 
NTID AAID . . — 

Length 


AA . 
— ' ' ■" Score 
Length 


Probability 


i0548386_t2_i'9 | 


|179 2099" .647; 


1944 | 





Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



.NTID 



AAID 



10626553 c2 94 



rsrr 



NT ■ 
Length 
HT2~ 



* AA 

T — ^-u Score 
- Length ■ — 



L_J 



Protein name 



Description 



Locus Name 



sp:TEGR_HSV11



Probability 
0.039



Acc# 
P06481 



TEGUMENT PHOSPHOPROTEIN US9 (10 KD PROTEIN)






ORF Name NTID AAID 


NT. 
Length 


AA - 
, — . , Score 
. Length 


Probability 

•V 


|1178127_il_i0 | |181 2101 • 


445 | 


1335 1319 | 


1.5e-134


Protein name 




Locus Name 


Acc# 



sp:SYS_HAEIN



P43833



Description . 
SERYL-TRNA SYNTHETASE (SERINE--TRNA LIGASE) (SERRS)



ORF Name 



12109686 cl '6.3 



Protein name 
Description , 
NO-HIT



NTID AAID "■■ 



2102 



, NT 

Length 

[55 — : — 



AA 

T ' — Score 
Length ^~ 



Probability 



2TTT" 



Locus Name. 



Acc# 



ORF Name 
12992085 12 26 



Protein name 
Description 
NO-HIT



NTID . AAID 



TST" 



NT 
Length 

wr~ — " 



AA 

T . Score 
Length — - 



Locus Name 



Probability 



Acc#. 



ORF . Name 



1369428 c2 97 



.' Protein name 

Description 

NO-HIT 



NTID AAID 
184V 



TTuT" 



NT 
Length 

78 " : I 



AA 
Length 

1337 — 



Score : Probability 



Locus Name 



Acc# 



ORF Name 


NTID 


; AAID .. 


NT 
Length 

mL 


AA 

-_ r — Ll Score 
Length 


Probability 


13 71092 5_t:J_46 


185 


2105 


148 




447 - 652 


1.1e-64


Protein name 












Locus Name 




Acc#, 














sp:MT1C_MORBO


P34721 


Description 


















METHYLTRANSFERASE MBOIC) (M.MBOI














ORF Name 


NTID 


AAID 


NT : 

Length' 1 


AA 

— Score 
Length . 


Probability 


|14i264i_cIJ>b . 


|I85 




| 147 | 




444 1 88 


0.00042


Protein, name 












Locus Name . 




Acc#


h 












sp:YRKI_BACSU 


P54436


Description 


















HYPOTHETICAL 8.2 KD PROTEIN IN BLTR-SPOIIIC INTERGENIC REGION






ORF Name 


NTID 


AAID 


NT 
Length 


• AA ' 
- - — • ' Score 
Length 


Probability 


14250312 C2_100 


, 187 , 


| 2107 


245 




741- • 






Protein name • , „ 












Locus. Name 




Acc# 


Description 


















NO-HIT


ORF Name r'' 


NTID 


'aaid' 


■ NT 
v Length 


' AA • 
, — . ,-. Score 
Length 


Probability 


I433466_c2_lil s . 


, 188 ; 


2108 


85 




258 141 


9.6e-09


Protein name 












Locus Name 




Acc# 














sp:MVIN_ECOLI


P75932


Description 


















VIRULENCE FACTOR MVIN HOMOLOG














ORF Name . 


NTID' 


AAID 


NT . 
Length 


AA n 
— . . Score 
Length : 


Probability 


14875390J:3_51 


|189 


| |2109 


1 




405 302 I 


|l.pe- 




Protein name 












Locus Name 




Acc# , , 














sp:YAEL_ECOLI


P37764



Description 



HYPOTHETICAL 49.1 KD PROTEIN IN CDSA-HLPA INTERGENIC REGION






ORF Name 


NT ID 


' AAID ' 


IN I 

Length 


, AA 

T — . , Score 
..Length ; , 


Probability 


15020887_cl_83 . 


190 


2110 


189. 




570 ,. 




Protein name 










Locus Name 


Acc# 


Description , . : 














NO-HIT


1 ORF Name' 


NT ID 


AAID 


NT 
Length 


'« 

. — . " - Score 
'Length 


Probability 


15885450ic3_i42 


| 151 


2111 


342 




1029 1 524 


6.6e-61


Protein name . 










Locus Name 


Acc# 



P44958 



Description 
VIRULENCE FACTOR MVIN HOMOLOG



ORF Name 



NT I D 



AAID 



166043 ±1 8 



2112- 



NT AA 
Length Length 
259 -I . 1780 



Score 



Probability 
7.3e-39 



Protein name 



Locus Name 



cytochrome c maturation protein B 



gp:AF044582



Acc# 
AF044582 



Description 



Shewanella putrefaciens NrfG homolog gene, partial cds; and mono-heme c-type
cytochrome ScyA (scyA), cytochrome c maturation protein A (ccmA), cytochrome c
maturation protein B (ccmB), cytochrome c maturation protein C (ccmC),
cytochrome c maturation protein D (ccmD), and cytochrome c maturation protein
E (ccmE) genes, complete cds.



ORF Name 



NT ID 



AAID 



117069628 ti 4 



TTTT 



NT AA ; 
Length Length 
[TOT - 



Score Probability 



, EE3 



Protein name ; 
■ Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NT ID AAID 



1187703 £2 21 



[2TT3~ 



NT AA 
Length Length 
TTT 1 , f3T2" 



Score: 



Probability , 
0.00026



Protein name 



Description 



Locus Name 



sp:Y4AR_RHISN



Acc# 
P55365 



HYPOTHETICAL 12.1 KD PROTEIN Y4AR



ORF Name 



NTID 



AAID 



2246275.7 cl 67 



T95~ 



NT, . AA 
Length Length 
67 - 



Score 



[ST" 



Probability 
0.00033



Protein name 



Locus Name 



hypothetical protein SC6U10.02 



pir:T35489



Acc# 
T35489 



Description 



ORF Name 



NTID 



AAID 



23470003 cl tfl 



NT 



AA • ■ 
- — Score 
Length Length — — — 



Probability 
|2.3e-31 ~ 



Protein name 



Locus Name 



sp:MVIN_ECOLI



Acc#
P75932



Description 
VIRULENCE FACTOR MVIN HOMOLOG



ORF Name 



NTID AAID 



23914017. c2 104 



TTT— 



TTTT 



NT AA ., 

Length Length 
SB "I 



^— ■'" Score 



HI" 



Probability . 
5.5e-09



Protein name 



Locus -Name" 



hypothetical protein ydaT



pir:C69770 



' Acc# 
C69770



Description 



ORF Name 



NTID .AAID" 



24219792 ±2 34 



NT AA 
Length . Length 
1296 



F9T~ 



Score 



Probability 
2.1e-41



Protein name. 



Description 



Locus Name 



sp7Cn57TT3EEE- 



, Acc# . 1 ; 
Q59640 





'■ ORF Name 


* NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


24244033 cl_70. 


199 


2119 


302 | 


. 909 |620 | 


|1.6e-$0 


Protein name 








•Locus Name' 


Acc# 



sp:YGLA_SYNP2



P28606 



Description 

HYPOTHETICAL 34.1 KD PROTEIN IN GLNA 3'REGION






ORF Name 



NT ID AAID 



24252302 c2 TUZ 



2120 



NT AA 
Length . Length 
493 I 11482 



Score Probability 
|5.ie-12b 



Protein name 



Locus Name 



2-oxoglutarate/malate translocator homolog yflS



pir:F69811



Acc#

F69811



Description 
ORF Name 



NTID AAID 



AA - 
„ " — ' Score Probability 
Length Length - — — - — — 



E 



433000b c3 vn 



][ 



2121- 



NT 

prc* 



Protein name 



Description 



|1.5e-38 



Locus Name 



gp:AB017194 



Acc# 
AB017194 



Plectonema boryanum ORF270, proline iminopeptidase, ferredoxin and amidase
enhancer genes, complete and partial cds.



ORF Name 



NTID 



AAID 



|246b0962. 13 4b 



TttT 



NT AA , 

Length Length 
'261 , 



7W 



Score ' Probability 
806 3.4e-80



Protein name 



Description 



: Locus Name ' 



sp:T2D1_STRPN



Acc# 
P09356



(R.DPNI)














ORF Name ' • , 


NTID 


. AAID 


■ ' NT ' 
Length 


> AA . 
Length 


' Score 


Probability ' 

• • •' • j 


24735875_r2JL6 


2 : 03 


2123 


73 


• 222 


I 54 1 


0.017



Protein name 



Locus Name 



sp:W0_YEAST 



Acc# 
Q04210 



Description ,. » 

HYPOTHETICAL id. -2 Kb PROTEIN IN SUM-ARGR1 iNTERGENIC REGION 



ORF Name 



NTID 



AAID 



2b391007 c2 110 



[2W 



NT "., aa: 
Length ' Length 
'216 



Score 



Probability 
1.0e-41



Protein name 



Locus' Name 



N-acetyl-anhydromuramyl-L-alanine amidase



gp:AF082575



Acc# 
AF082575 



Description 



Pseudomonas aeruginosa N-acetyl-anhydromuramyl-L-alanine amidase (ampD) and
transmembrane protein AmpE (ampE) genes, complete cds.






ORF Name 


NT ID, 


AAID 


NT 

Length, 


AA 

T — . , Score 
Length 


Probability . 


25662 782_£2_24 


... 205 


212b 


258 


|777 | 288 


2.7e 




Protein name 










Locus Name 




Acc# 












sp:CCMA_RHOCA


P29959


Description 
















"PROTEIN HELA) / . . " 




ORF Name 


NT ID 


AAID . 


NT 
Length 


AA ScCre 
Length . 


Probability 


289052_cl_66 


206 


|2126 


1 l lb4 


465 ' |220 | 


4.3e 


| 


Protein name 










Locus • Name 




Acc# 


conserved hypothetical protein




pir:B75344 




B75344 


Description-. 
















ORF Name 


NT ID 


AAID . 


NT 

Length 


AA ' " 
t ~~ , i Score . 
Length 


Probability 


2930i457_r3_ : _44 


207 


2127 


IP 


282 . ' , 






Protein name 










Locus Name 




Acc# ■ 


Description, 
















NO-HIT




ORF Name 


NT ID 


AAID 


NT 
Length 


AA ; ' 
, — . .. Score 
Length . 


Probability 


29507800 c2 95 


. 208 


2128,, 


397. 


\ 1194 |883; | 


2.4e-88


Protein name - 










Locus Name 




' Acc# 












sp:RP32_PSEAE 


P42378 


Description 














r, ■ . 


RNA POLYMERASE SIGMA-32 FACTOR




ORF Name'. 


NT-ID. 


AAID .->- 


NT 
Length 


AA . 

, — . , Score . 
Length. 


Probability.- 


34b$9707 c3 131 


209' 


2129 




288 | |74 | 


0.023


Protein name 










■ Locus Name 




ACC# 


F22C12.13




gp:AC007764


AC007764 



Description 



Genomic sequence for Arabidopsis thaliana BAC F22C12 from chromosome I,
complete sequence.






ORF Name 



NT ID 



AAID 



12130 



NT 



AA ■ 

. I, t Score 
Length Length - — ; — 



Probability 



Protein name 



Locus Name 



sp:YAEL_ECOLI



Acc# 
P37764



Description 

HYPOTHETICAL 49.1 KD PROTEIN IN CDSA-HLPA INTERGENIC REGION



ORF Name 



NT ID 



AAID 



36S20625 t2 31 



NT ' AA 
Length Length 
|256 



Score 



[771" 



72T~ 



Probability 
TTTe^TT; 



Protein name 



Locus Name 



UMP kinase



gp:AB010087 



Acc#

AB010087 



Description 



Pseudomonas aeruginosa rpsB, tsf, pyrH, frr genes for ribosomal protein S2,
elongation factor Ts, UMP kinase, ribosome recycling factor, complete cds.



ORF Name 



-NT ID 



AAID 



3907818 t2_32 



ITT 



2132 



NT ' AA 
Length t Length 
|187' | |b64 | 



Score 



Probability 
7.6e-60 ~ 



Protein name 



Locus .Name 



ribosome recycling factor 



Description 



gp:AB010087



Acc# 
AB010087 



Pseudomonas aeruginosa rpsB, tsf, pyrH, frr genes for ribosomal protein S2,
elongation factor Ts, UMP kinase, ribosome recycling factor, complete cds.



ORF Name \ 


. . NT • 

NT-ID AAID — , 
Length 


AA ' ■ " . ' ' 
T — ; . Score . 
Length - 


Probability 


391068_±2_33 


213 2133 • 272 


815 B34. • 


2.3e-51


Protein name 




Locus Name 




Acc# 






sp:UPPS_ECOLI




Q47675:P75668


Description 








(DI-TRANS-POLY-CIS-DECAPRENYLCISTRANSFERASE)


ORF Name , 


NT 

NTID AAID — ; 

Length 


AA ■* 
„ — Score 
Length 


Probability 




| 214 | 2134 |204 | 


.. 515 | 592^ 


1.6e-57


Protein name 




, Locus Name 




Acc# : 






sp:TKT1_ECOLI




P27302 


Description . 










TRANSKETOLASE 1 (TK 1)












ORF Name 



NT ID 



AAID 



3947932 r3 41 



NT 
n 

257 



AA ■ 

t Score 
Length Length — 



Probability 
2.0e-05



Protein name 



Locus Name 



sp:YEEZ_ECOLI



Acc# 
P76370 



Description . . . " 

HYPOTHETICAL KD PROTEIN IN SBCB-HISL INTERGENIC REGION PRECURSOR



ORF Name ' 


NTID 


AAID 


NT 
Length 


AA 

T — ^ Score 
Length 


Probability , 


4il06B7_t2_30 


216 


1 2136 


497 


1494 |1775 | 


|7.1e-IU3 | 


Protein name 








Locus Name 


Acc#










sp:TKTl_ECOLI 


P27302 


Description 












TRANSKETOLASE 1 (TK 1)










, ''ORF Name 


. NTID 


AAID 


NT • 
•Length 


— ; , Score 
Length 


Probability 


4345300_12_20 


217 


| 2137 


989 


2970, |2958 | 


0.0



Protein name 



Description 



•Locus Name 



sp:SYV_HAEIN



Acc# 
P43834 



VALYL-TRNA SYNTHETASE (VALINE--TRNA LIGASE) (VALRS)






ORF Name NTID AAID . — , 

Length 


AA 
Length 


Score 


Probability 


4495268., cl 84 218 . | 2138 110 


333 


|512 


4.9e-49 



"Protein name 



Locus Name 



ferredoxin [3Fe-4S]



pir:PHAV 



Description 

ORF Name 
|46 93768_il_ll 
Protein name • 

Description 

REDUCTOISOMERASE)



Acc# 

A29936:A00218



NTID 



AAID 



NT 
Length 



435^ 



AA 
Length 
11308 



Score Probability 



|2.8e-85 



Locus Name 



sp:DXR_ECOLI 



Acc# 

P45568:P77209






ORF Name 


NTID 


AAID . ■ ■ 


NT " 
Length 


AA 

T 7T\ , ' Score 
Length 


Probability 


4772325_cl_69 


| ^0 , 


1 2140 


93 : 


282 |77 | 


0.0071


Protein name 










Locus Name 




Acc# 


cytochrome b




gp:ASA228475 


AJ228475 


Description 














Andricus solitarius cytb gene.




ORF. Name 


NTID 


AAID 


NT 
Length 


' * AA . 

— , Score 
Length - - - 


Probability 


|5109626_tl_6 5 


221 


■ 2141 • 


81 


246 355 


2.1e-32


Protein> name 










Locus Name 




Acc# 












sp:MT1A_MORBO


P34720


Description 
















METHYLTRANSFERASE MBOIA) (M.MBOIA)














ORF Name , 


NTID 


; AAID 


NT 
Length 


AA 

. Score 
Length. 


Probability 


|5350281_c3_139 


222. 


2142 ■ 


75 : " 


231 






Protein' name 










Locus Name 




ACC# 



Description 
NO-HIT



ORF Name 
p23312_t3_37 



NTID AAID 



NT . ' 
Length 
63 



AA 

x Score 
Length — - 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 
|103187_i:2~5" 



NTID AAID 
2 24 | 12144 



: NT, 
Length 
9 8 ' 



AA ' 
Length 
.1297 " 



Scp_re.. Probability 



Protein name 
Description 

NO-HIT 



Locus Name 



Acc# 






ORF Name 


NTID 


AAID 


NT ; 

Length 


AA 
Length 


' Score 


Probability 


16917 t2 4 


225 




2145 


. 164- 


. 495. 


. F ; 1 


|2 ,6e 


- Z 1 • 


Protein name 












Locus Name 




Acc# ■ 














sp:CYST_ECOLI


P16701


Description 




















SULFATE TRANSPORT SYSTEM PERMEASE PROTEIN CYST










ORF Name ' 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


20163930__cl_9 






2146 


271 


813 


502 ' 


5.6e-48


Protein name 












Locus Name 




Acc# 














sp:RHLB_HAEIN


P44922 


Description 




















ATP-DEPENDENT RNA HELICASE RHLB HOMOLOG












- ORF Name 


NTID 


AAID 


NT 
Length 


' AA 
Length 


Score 


Probability 


24257755 Cl 8 


227 




2147 ( 




468 ; 








Protein name 












Locus Name 




Acc# 


Description 




















NO-HIT


- , • • ••«. 

ORF Name 


NTID' 


AAID 


• NT 
Length 


AA 
Length 


Score 


Probability 


lS5941S7_tl_5 ■ 


228 . 




2148 


510 


1533 








Protein name 












Locus Name 




Acc#. 


Description 1 




















NO-HIT


ORF Name" " 


"NTID 


AAID 


NT 
Length 


aa' 

Length 


Score 


Probability ■;> 

• ■ — 7 


22897253_r2_6 


|229 




2149 


269,. 


•810 ■ ' 


: 305. . 


4.2e-27



Protein name 



Locus Name 



putative acyltransferase



gp:SCM10



Acc# ,; 
AL133469 



Description - 
Streptomyces coelicolor cosmid M10. 






ORF Name 



NTID 



AAID 



24485937 t3- -13 



12150 



NT AA 
Length Length 
62 



Score 



T¥7~ 



Probability 
1.7e-09 • 



Protein name 



Locus Name 



glutamate dehydrogenase 



gp:UAW010746 



Acc# ■ 
AJ010746 



Description 



Antarctic bacterium TAD1, dhe gene.










ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


2501562_t3_ 


_9 231 


2151 


288 | 


I s67 1 


547 





Protein name 



Description 



Locus Name 



sp:FTSH_ECOLI



Acc# 
P28691 



CELL DIVISION PROTEIN FTSH










ORF Name 


. NTID AAID 


NT 
Length 


• AA ' 
Length 


Score 


Probability 


25415636_11_4 


232 . 2152 


579 


2040 


[1148 | 


|7.4e-181' 



Protein name 



Description 



Locus Name 



sp:HTPG_ECOLI



Acc#
P10413



PROTEIN '■' 






... . ,„ T . 






ORF Name . 


NTID - AAID , 


NT 
Length 


AA 

T — L1 ' Score' ■ 
Length • 


Probability . :. ' 


2636668S_c2^24 


233 


2153 . 


791 


2375 |1520 | 


1.5e-156



Protein name 



Locus Name 



penicillin-binding protein 1A



gp:PAU73780 



Description 



Acc# . 
U73780



Pseudomonas aeruginosa penicillin-binding protein 1A (ponA) gene, complete
cds, and malic enzyme gene, partial cds.


' " NT AA 
ORF Name ' NTID AAID — , . = — . '. Score Probability 

, ■ LenyLlr . Length — — . J r 


12304661_jt2_18 234 ■ [2154 |584 


1755 | |763 | |1.2e-7b ■ .v 



Protein name 



Locus Name 



sp:RECN_ECOLI



Description ■ , > 

DNA REPAIR PROTEIN RECN (RECOMBINATION PROTEIN N)



Acc# 

P05824:P76602






ORF Name 



NT ID AAID 



16578133 c3 57 



NT . AA 
Length Length 
F5 ~1 1199 



Score 



ED-[ 



Probability 
0.013 



Protein name 



Description 



Locus, Name 



sp:PSBR_TOBAC



Acc# 
Q40519 



PHOTOSYSTEM II 10 KD POLYPEPTIDE PRECURSOR (PII10)






ORF Name. 


NT ID AAID 


NT 
Length 


AA- . 
Length 


Score 


Probability 


|19bk4bl0_t2_17 


| |236 | 2156 


|194 


|bBb | 


444 | 


|7.8e-42 | 



Protein name 



N-formylmethionylaminoacyl-tRNA deformylase



Locus Name 
pir:S23107



Description 



ORF Name 



, Acc# 

S23107:S41694:A49696:B65121



NTID AAID 



23554538 t3 29 



2157 



NT ' AA 
Length Length 
285 



■ Score 



Probability 
4.7e-51 ' " 



Protein name 



Locus Name 



beta-ketoacyl-acyl carrier protein synthase II



gp:AF188707



Acc# 
AF188707 



Description 



Photobacterium profundum acyl carrier protein (acpP) gene, partial cds;
beta-ketoacyl-acyl carrier protein synthase II (fabF) gene, complete cds; and
aminodeoxychorismate lyase (pabC) gene, partial cds.


ORF Name 


"NT - AA 

NTID AAID — r- — - * Score Probability 
Length Length - 


j23912502/tl_9 . 




2158 . 90 


273 200 5.6e-lfc. | 


Protein name 






Locus- Name /' / Acc# 



sp:YHHP_ECOLI



P37618 



Description , ■ - 

HYPOTHETICAL 9.1 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION






ORF Name . ■ NTID AAID 


NT 

Length 


AA ' 

r — Score 
Length — — 


Probability 


23985753J:3_27 .-• 239 - 2159 


.167 


b04 273 


1.0e-23


Protein name 






Locus Name 


. Acc# 








gp:ECU28377


U28377


Description 










Escherichia coli K-12 genome; approximately 65 to 68 minutes.




ORF Name ■ , NTID AAID 


NT. 
, Length 


AA 

T — ^ Score 
Length 


Probability . 


|24302263_£1J> |240 . | |2160 


I 1 " 1 


|b«2 | |340 | 


|8.2e-31' 


Protein name ' 






Locus Name 


Acc# 


hypothetical protein b2948


pir:C65080


C65080 


Description 1 , 










ORF Name ' NTID AAID 


NT 
Length 


. AA 

^ — x , Score 
Length 


Probability 


24353458_t2_l>0 . . 241 2151 


308 


927 |671. | 


6.9e-66


Protein name ' 






Locus Name 


■ Acc#. 


site-specific recombinase




gp:AF033497


AF033497 



Description
Proteus mirabilis site-specific recombinase (xerD) gene, complete cds.



ORF Name 



NTID ■ AAID 



24642562 t2 13 



NT 
Length 
102 



' AA ' ' i ' " 

— ; - ' Score " Probability 
Length ~ — 



Protein name 

Description 

NO-HIT 



1309 I 



Locus Name 



Acc# 



ORF. Name 



NTID AAID 



3007832 12 19 



NT . - 
Length 
T£T9 



AA 
Length 
1510 



Score Probability 



Protein name 
Description 



Locus Name 



Acc# 






ORF Name 



NTID AAID 



36205013 El 23 , 



2164 



NT • ' AA 
Length Length 
T£T~ I 11085 



Score Probability ■> 



1.3e-25



Protein name 



Locus. Name 



hypothetical protein 



pir:G75388 



Acc# 
G75388 



Description 
ORF Name 



NTID AAID 



13953593. ci 39 



PT^5" 



NT 



AA . 
— Score 
Length Length . — — — 



] 



Protein name. 



Locus Name 



Probability 
|l.le-35 

Acc# 



imidazoleglycerol-phosphate synthase



pir:D69070



D69070 



Description 



ORF Name' 



NTID AAID 



488342b tl 2 



2166 



NT • AA 

Length , Length 
2TT6 



Score Probability 



|1.4e-19 



Protein name 



Description 



L'ocus Name 



sp:YQIA_ECOLI



Acc# 
P36653 



HYPOTHETICAL 21 


6 kd PROTEIN IN BAW4tJ.CC INTERGENIC 


REGION 


(P193) 


ORF Name! 


NT AA ' 
NTID .AAID — ; ^ , 
Length Length 


Score 


Probability 


6506_tl_3 


247 2167 637' 1 1914 


2041 


4.6e-211 ^ 



Protein name 



. Locus Name 



topoisomerase IV subunit



gp:AB003429



Acc# 
AB003429 



Description 



Pseudomonas aeruginosa DNA for topoisomerase IV subunit, complete cds.



ORF Name 



NTID 



AAID 



805180 cl 38 



2168. 



NT :AA 
Length Length 
222 



Score 



5"5¥" 



Probability 
1.7e-53 — 



Protein. name 



Locus Name 



sp:HIS7_PEA



Acc#. 
Q43072 



Description 

IMIDAZOLEGLYCEROL-PHOSPHATE DEHYDRATASE (IGPD)






ORF Name 


NTID 


AAID 


NT 

Length ■ 


AA 
Length 


Score 


Probability . 








134 




405 ' 






Protein name 










Locus Name 


Acc# 


Description 
















NO-HIT 


ORF' Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


862761_cl_43 


250 


2170 


72" 




219 






Protein name 










. Locus Name 


" ■* Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


s AAID .: 


NT 1 
Length 


AA 
Length 


Score 


Probability 


12281888_c1_40


251


2171


78




237






Protein name 










Locus Name


Acc# 


Description 
















NO-HIT


ORF Name


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability


1357177_t2_15


252


2172


304




915


574


3.3e-66


Protein name 










Locus Name 


Acc# 



sp:GALU_ECOLI



P25520



Description

URIDYLYLTRANSFERASE) (URIDINE DIPHOSPHOGLUCOSE PYROPHOSPHORYLASE)



ORF Name 



NTID AAID



-144-6-3877 vt 3 23 



NT 
n 



AA Score
Length Length



\TTT 



Probability 
1.0e-23



Protein name 



Locus Name 



sp:YJGQ_ECOLI



Acc# 
P39341 



Description 

HYPOTHETICAL 39.8 KD PROTEIN IN PEPA-GNTV INTERGENIC REGION (O361)






ORF Name 



NTID AAID



NT 



AA 



TTTT 



Length Length
T73 



Score Probability



5T7~ 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT



ORF Name 



NTID AAID 



15040927_c2_50



2175 



NT AA 
Length Length 
112 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name



NTID AAID



NT AA

Score Probability
Length Length



[1661018:4 c2 54: 



mr 



] E 



569 4.4e-55



Protein name



Description 



Locus Name 



sp:TESB_ECOLI



Acc# 
P23911 



ACYL-COA THIOESTERASE II



ORF Name 



NTID 



AAID 



NT AA
Length Length 



16819827 t-i.-6-- 



7T77T 



TTT 



Protein name 
Description
NO-HIT



Locus Name



Probability 



Acc# 



ORF Name 
19531885_c2_57



NTID . 



AAID 



NT AA 
Length Length
^73 " 



Score Probability 



TTTT" 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID 



AAID 



NT AA
Score

Length Length



Probability 



19S3838B ±2 14 



75 



73 0.016



Protein name 



Description 



Locus Name 



gp:SMI240618



Acc# 
AJ240618 



Streptococcus mitis xpt gene, strain 12261.



ORF Name 



NTID AAID 



Score Probability 



209429^6, 13 ,27 



Protein name 



Description 



NT AA 
Length Length

376 1131 1060 4.1e-107

Locus Name Acc#



sp:GALE_BACSU



P55180



GALACTOSE 4-EPIMERASE) 



ORF Name



NTID 



AAID 



21575051- ±3 28 



2181 



NT AA
Length Length
321



Score



Probability 
3.8e-42



Protein name 



Description 



Locus Name 



sp:YRFI_ECOLI 



Acc# 
P45803 



HYPOTHETICAL 32.5 KD PROTEIN IN MRCA-PCKA INTERGENIC REGION



ORF Name



NTID 



AAID 



23557930_c3_61



NT AA 
Length Length

] EHZ 



1850



Score Probability 
1758 3.9e-182



Protein name 



Locus Name 



glucosamine synthase 



gp:AF032884



Description 



Acc# 

AF032884:L77909



Thiobacillus ferrooxidans N-acetylglucosamine-1-phosphate uridyltransferase
(glmU) gene, partial cds; glucosamine synthase (glmS) and RecG (recG) genes,
complete cds; and transposon Tn5468, complete sequence.






ORF Name 



NTID



AAID 



23634680 12 18 



NT AA
Length Length 
TTl 1 11272 



Score 



Probability 
2.3e-35



Protein name 



Locus Name



putative UDP-glucose dehydrogenase



gp:ALW243431



Acc# 
AJ243431 



Description 



Acinetobacter lwoffii wzc, wzb, wza, weeA, weeB, wceC, wzx, wzy, weeD, weeE,
weeF, weeG, weeH, weeI, weeJ, weeK, galU, ugd, pgi, galE, pgm (partial) and
mip (partial) genes (emulsan biosynthetic gene cluster), strain RAG-1.



ORF Name NTID AAID 


NT
Length


AA

Score
Length


Probability 


24400250_t3_24 264 2184


860




2583 1162


6.4e-118


Protein name 






Locus Name 


Acc#








sp:PLSB_HAEIN


P44857


Description 










GLYCEROL-3-PHOSPHATE ACYLTRANSFERASE (GPAT)








ORF Name NTID AAID


NT 
Length


AA Score 
Length 


Probability 


24493801 13 .30 ' . 265 2185 


375




1128 489


1.3e-46


Protein name 






Locus Name 


Acc# 


FauI DNA methyltransferase






gp:AF029070


AF029070 


Description



Flavobacterium aquatile FauI DNA methyltransferase (fauIM) gene, complete


ORF Name NTID 


NT AA
AAID Score Probability
Length Length


26797302_c2_55 266


2186 393 1182 515 2.3e-49



Protein name 



Locus Name



sp:YAIW_ECOLI 



Acc# 
P77562 



Description
HYPOTHETICAL 40.4 KD PROTEIN IN SBMA-DDLA INTERGENIC REGION



ORF Name NTID


AAID 


NT 
Length


AA

Score
Length


Probability 


3317260_t1_5 267


2187 


573




1722 1505 


p.9erlb4 


Protein name 










Locus Name 


Acc# 


putative phosphoglucose isomerase 








gp:ALW243431


AJ243431


Description














Acinetobacter lwoffii wzc, wzb, wza, weeA, weeB, wceC, wzx, wzy, weeD, weeE,
weeF, weeG, weeH, weeI, weeJ, weeK, galU, ugd, pgi, galE, pgm (partial) and
mip (partial) genes (emulsan biosynthetic gene cluster), strain RAG-1.


ORF Name NTID


AAID 


NT 
Length 


AA
Score
Length


1 

Probability 


i9Jtt76a_t2_2!4- 268 


2188 


71




216 71


0.026


Protein name 










Locus Name


Acc# 


transcription regulator homolog yozG






pir:C69931 


C69931


Description 














ORF Name NTID


AAID 


NT 
Length 


AA
Score
Length


Probability 


6729635_c2_46 269


2189 


171 




516 94


0.0062


Protein name










Locus Name 


Acc#


hypothetical protein C45H4.14






pir:T32722


T32722 


Description 














ORF Name NTID


AAID


NT 
Length 


AA 

Score
Length 


Probability 


976387_t2_l> , 270 


2190 


88 


• 


267 74


0.0025


Protein name 










Locus Name


Acc#


hypothetical protein T16L4.170






pir:T09929


T09929
Description 














ORF Name NTID


AAID


NT
Length 


AA 

Score
Length 


Probability 


iu823462_cij.3 271 


2191 


67




204


Protein name








Locus Name 


Acc# 


Description 


















ORF Name 


NTID 


NT AA 
AAID Score
Length Length 


Probability 


i2376535_t2_b 


272 


2192 214 645 74


0.0011


Protein name 




Locus Name




Acc# 






gp:VCU39068 




U39068 


Description 




* 






Vibrio cholerae pathogenicity island, partial and complete cds.




ORF Name 


NTID 


NT AA
AAID Score
Length Length


Probability 


22065635_12_9 




2193 521 1566 1440


1 P 2e 


-147. 



Protein name 



Locus Name 



sodium/proline symporter opuE: proline transporter opuE



Description



pir:H69670 



Acc# 
H69670 



ORF Name 



NTID AAID



NT AA 

Score Probability
Length Length



124228400 cl 20 



T7T~ 



2194 



1440 



1110



2.1e-112



Protein name



Description 



Locus Name 



sp:HEMN_ECOLI



Acc#

P32131:P76772



(COPROPORPHYRINOGENASE) (COPROGEN OXIDASE)


NT
ORF Name NTID AAID

Length


AA

Score
Length 


Probability 


29328457_c1_14 275 2195 98


297 95


7.5e




Protein name


Locus Name 




Acc#




sp:MINE_ECOLI 




P18198 



Description 

CELL DIVISION TOPOLOGICAL SPECIFICITY FACTOR 



ORF Name 



NTID AAID 



4860303_c1_19



NT AA 
Length Length 
1153 I 



Score 



Probability
3.0e-49



Protein name 



Locus Name 



sp:PTH_HAEIN



Acc#
P44682



Description
PEPTIDYL-TRNA HYDROLASE (PTH)






ORF Name 


NTID


AAID 


NT 
Length 


AA 
Length 


Score 


Probability


5835875_t2j4 


277 


2197 


60


183 






Protein name 








Locus Name


Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID


NT
Length


AA 
Length 


Score 


Probability 


9000ii_c3_28 


278


2198


234


705


269


2.7e-23 



Protein name 



Locus Name 



probable ribosomal protein L25



pir:H71665



Acc# 
H71665 



Description 



ORF Name 



NTID 



AAID 



9869006_t1_2



NT AA
Length Length
YTT — "H 1219 ~ 



Score Probability
271 1.7e-23



Protein name 



Locus Name 



30S subunit ribosomal protein S21



gp:AF014397



Acc# 
AF014397 



Description 



Pseudomonas putida macromolecular synthesis operon: 30S subunit ribosomal
protein S21 (rpsU), DNA primase (dnaG), and sigma-70 (rpoD) genes, complete cds.








ORF Name NTID AAID


NT 
Length 


AA

Score
Length


Probability 


il88587B_c3_7e v 280 v |2200 , 


455


1368 1218


7.5e-124


Protein name




Locus Name


Acc#






sp:Y164_HAEIN




Description






P43955:P43


HYPOTHETICAL PROTEIN HI0164/165


ORF Name NTID AAID


NT
Length 


AA

Score
Length 


Probability 


12687781_c3_70 281 2201


174 


525 512 


4.9e-49


Protein name 




Locus Name 


Acc# 






sp:IF3_HAEIN


P43814


Description 












ORF Name 



NTID



AAID 



NT 



AA 



Length Length 



Score Probability 



14093967_cl_49 


282 2202 211


STS 74T" 2.6e-73 


Protein name 




Locus Name Acc#


NqrU 




gp:AF165980 AF165980


Description


Vibrio harveyi Na+-translocating NADH-quinone oxidoreductase complex operon,
complete sequence.




ORF Name


NT AA
NTID AAID Score Probability
Length Length


14540908_c3_77


283 2203 270 


813 474 5.2e-45


Protein name 




Locus Name Acc#


Nqrd 




gp:AF117331 AF117331


Description 


Vibrio cholerae N16961 Na+-translocating NADH-ubiquinone oxidoreductase
enzyme complex, complete sequence.




ORF Name


NT AA

NTID AAID Score Probability
Length Length


15565712 12. 21 


284 2204 189


570 162 6.0e-12



Protein name 



Locus Name 



Acc# 





gp:ECOUW93 




U14003 


Description 






Escherichia coli K-12 chromosomal region from 92.8 to 00.1 minutes.



ORF Name 



NTID



AAID 



16460432_c2_65



NT AA

Length Length 
78 



Score Probability 



TIT 



Protein name 
Description 
NO-HIT



Locus Name 



Acc#



ORF Name 



NTID



AAID 



22038177 13 27- 



2206 



NT 

Tl 

^7 



AA
Score
Length Length



Probability 
1801 I |1.2e-18!> 



Protein name 



Locus Name 



putative efflux pump component MtrF



gp:AF176821



Acc# 
AF176821 



Description 



Neisseria gonorrhoeae strain EU75 putative efflux pump component MtrF (mtrF)
gene, complete cds.






ORF Name 



NTID



AAID 



AA 

Score
Length Length



24423250_c1_42



TOT" 



2207



NT 
n 
ffT7 



TTIT" 



T2T" 



Probability
2.5e-0b 



Protein name 



Locus Name 



pr2 



gp:MHU19289



Acc# 
U19289 



Description



Mycoplasma hyopneumoniae J ATCC 27219 multidrug resistance protein homologs
pr1 and pr2 genes, complete cds, and 23S rRNA gene, partial sequence.


ORF Name


NTID


AAID 


NT
Length


AA

Length


Probability 


25392778_tl_l 


288


2208 


201 


606 387


8.6e




Protein name 








Locus Name 




Acc#


4-hydroxyphenylacetate 3-monooxygenase (EC


gp:D90737




D90737:AB001340


Description 












Escherichia coli genomic DNA. (22.8 - 23.1 min).






ORF Name 


NTID 


AAID


NT 
Length 


AA
Score
Length


Probability 


31268B37_i:3_28 . 


289


2209


412


1239 1835


2.4e-189


Protein name 








Locus Name




Acc# 










sp:CATA_HAEIN




P44390 


Description 














CATALASE


ORF Name


NTID


AAID


NT
Length 


AA 

Score
Length


Probability 


33223291_12_19 


290 


2210


71 


pi. | , 






Protein name 








Locus Name 




Acc#


Description 








■ , 






NO-HIT


ORF Name


NTID


AAID


NT
Length 


AA 

Length


Probability


33233937_c2_59


291


2211


782


2349 1415


1.0e-144


Protein name 








Locus Name




Acc# 



sp:VACB_ECOLI



Description
VACB PROTEIN



P21499:P76800



ORF Name 



NTID 



AAID 



33867132_t1_12



TFZTF 



NT AA

Length Length 
225 



Score Probability



678 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT



ORF Name 



NTID 



AAID 



3393183_c2_61



T5T 



TUT 



NT 
n 



AA 

Score
Length Length 



Probability 
3.8e-123



Protein name 



Locus Name



NqrB 



gp:AF117331 



Acc# 
AF117331 



Description 



Vibrio cholerae N16961 Na+-translocating NADH-ubiquinone oxidoreductase
enzyme complex, complete sequence.



ORF Name 



NTID 



AAID 



34000785_c3_73



■25T 



TZTF 



NT AA
Length Length
61



Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT



ORF Name



NTID AAID 



341%0S2- c2 63 



1295 



NT



AA
Score
Length Length



Probability 



Protein name



1251- | |16b0 | |i.2e-165 . 
Locus Name Acc# 



NqrF 



gp:AF117331



AF117331 



Description 



Vibrio cholerae N16961 Na+-translocating NADH-ubiquinone oxidoreductase
enzyme complex, complete sequence.



ORF Name 



NTID AAID 



3535043_c2_58



NT AA 
Length Length
1542 I 11525 



Score Probability 
2200 6.5e-228



Protein name 

Description 
(THRRS)



Locus Name 



sp:SYT_HAEIN



Acc#
P43014 






ORF Name 



NTID AAID 



4720957_c2_62



NT AA 
Length Length 
227 



Score Probability 
679 9.8e-67



Protein name 

Description 
HYPOTHETICAL PROTEIN HI0168/169 



Locus Name 



sp:Y168_HAEIN 



Acc# 

P43958:P43959



ORF Name 



NTID AAID 



473137_c1_41



NT AA
Length Length
75



Score Probability



7TT 



Protein name 
Description 



Locus Name 



Acc#



NO-HIT


ORF Name


NTID


AAID


NT
Length 


AA 

Score
Length


Probability 


4801625_12_20 . 




2219


252


759 622


1.1e-60


Protein name 










Locus Name




Acc# 












sp:HIS4_RHOSH




P50936 


Description
















ISOMERASE


ORF Name 


NTID 


AAID


NT
Length


AA
Score


Probability 








Length 






5882211_12 14 • 


300


2220


118


357 153


2.7e-10


Protein name 










Locus Name 




Acc#


hypothetical protein 1








pir:S47051




S47051


Description 
















ORF Name 


NTID


AAID


NT
Length


AA
Score
Length 


Probability 


|682S4i_c2_B5 


301


2221


86


261 100


2.2e-05


Protein name 










Locus Name




Acc#


hypothetical protein PH0217




pir:G71244




G71244 



Description 






ORF Name 



NTID AAID 



14103377 T2 9 



TUT 



TZTT 



NT 



AA 

Score
Length Length



TOT" 



Probability 
9.0e-41



Protein name 

Description

(EC 2.4.2.-) (MONOFUNCTIONAL TGASE)



Locus Name 



sp:MTGA_ACICA



Acc#
O24849



ORF Name


NTID


AAID


NT
Length


AA
Score
Length 


Probability 


16973437_c3_30 


303 


2223


79


240




Protein name 








" Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Score
Length


Probability


19745308_ii_3 


304


2224




201




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name


NTID 


AAID 


NT 
Length


AA

Score
Length 


Probability 


24261257J:3ii4 


305


2225


123


372 140


f .Oe-09 



Protein name



Locus Name 



sp:PNCB_SALTY



Acc#
P22253



Description

NICOTINATE PHOSPHORIBOSYLTRANSFERASE (NAPRTASE)



ORF Name 



NTID 



AAID 



25600925 11 2. 



2226



NT AA 
Length Length 
91



Score Probability 
98 9.9e-05



Protein name

Description 

(EC 2.4.2.-) (MONOFUNCTIONAL TGASE)



Locus Name 



sp:MTGA_ACICA



Acc#
O24849






ORF Name


NTID


AAID


NT
Length


AA

Score
Length


Probability 


30659433_c2_21


307


2227


68


207




Protein name 








Locus Name


Acc# 


Description 












NO-HIT


ORF Name 


NTID


AAID


NT
Length


AA
Score
Length 


Probability 


3303178_t1_1


308


2228


179


540 430


2.4e-40



Protein name 



Locus Name



solanesyl diphosphate synthase



gp:AB001997



Acc# 
AB001997 



Description

Rhodobacter capsulatus DNA for solanesyl diphosphate synthase, complete cds.



ORF Name 


NTID 


AAID 


NT 
Length


AA

Score
Length 


Probability 


35i82887_c2_23 ... 


309


2229


191


576 584


2.9e-57


Protein name 








Locus Name 


Acc# 










sp:IPYR_HAEIN


P44529


Description 












PHOSPHO-HYDROLASE) (PPASE)






. ■ i 




ORF Name


NTID


AAID


NT
Length


AA

Score
Length


Probability 


35125555_t1_4


310


2230


374


1125 1283


9.7e-131



Protein name 



Description 



Locus Name 



sp:AROC_HAEIN



Acc#
P43875



PHOSPHOLYASE)


ORF Name 


NTID 


AAID 


NT 
Length


AA

Score
Length


Probability


5834702_t2_11


311


2231


162


489 371


4.3e-34


Protein name 








Locus Name


Acc#



sp:YCHJ_HAEIN 



Description 
HYPOTHETICAL PROTEIN HI0277 



P44609 






ORF Name 
|88253£_cl_i5 



NTID 



AAID 



IJTT 



12232 



NT AA 
Length Length 
1258. I [T7T 



Score 



WTT 



Probability 
B'.7e-39: — — 



Protein name 



Locus Name 



lipoate biosynthesis protein B 



gp:AF147448



Acc#
AF147448 



Description 



Pseudomonas aeruginosa strain PAO1 penicillin-binding protein 2 (pbpA),
rod-shape-determining protein (rodA), membrane-bound lytic transglycosylase
(mltB), rare lipoprotein A (rlpA), penicillin-binding protein 5 (dacA), and
lipoate biosynthesis protein B (lipB) genes, complete cds; and unknown gene.



ORF Name 



NTID AAID 



973756_c3_34



TUT 



NT AA
Length Length 



Score Probability



TTT 



Protein name
Description



Locus Name 



Acc# 





ORF Name NTID AAID


NT AA
Score
Length Length


Probability


?7&05'5_t2_i0 314 | 2234 


745 2238 2349


1.1e-243


Protein name 




Locus Name 




Acc#


polyphosphate kinase 




gp:ACRBDOXN




Z46863


Description










Acinetobacter sp. cysD, cobQ, sodM, lysS, rubA, rubB, estB, oxyR, ppk, mtgA,
ORF2 and ORF3 genes.










ORF Name NTID AAID


NT AA
Score
Length Length


Probability 


10673by7_rl_4 ■ ai5 ; | p3b 


402 1209 1210


5.3e-123


Protein name




Locus Name 




Acc#







sp:TYRB_ECOLI


P04693


Description 










AROMATIC-AMINO-ACID AMINOTRANSFERASE (AROAT) (ARAT)




ORF Name 


NTID 


AAID


NT

Length


AA

Score
Length


Probability


14572 15 2_tl_l 


IP 16 


| 2236 


250 


783 | 586 


7.0e-B7. 


Protein name








Locus Name 


Acc# 










sp:YCIK_ECOLI 


P31808:P77516


Description


(EC 1.-.-.-)


ORF Name 


NTID


AAID


NT
Length


AA
Score
Length 


Probability 


20i2638.6Jt2_8 ' 


317


2237


198


597 325


3.2e-29 


Protein name 








Locus Name 


Acc#










sp:YTFL_ECOLI


P39319 


Description 












HYPOTHETICAL 49.8 KD PROTEIN IN CYSQ-MSRA INTERGENIC REGION




ORF Name


NTID 


AAID 


NT 

Length 


AA
Score
Length 


Probability 


|2509375__c2_2? 


318


2238


92


279




Protein name 








Locus Name


Acc#


Description 












NO-HIT


ORF Name


NTID


AAID


NT
Length


AA

Score
Length


Probability 


26 808317 J:2jS , 


319


2239


232


699 575


8.1e-56 


Protein name








Locus Name 


Acc#














Description 










P17993:P76924


METHYLTRANSFERASE)


ORF Name 


NTID


AAID 


NT 
Length 


AA 

Score
Length


Probability


34394050_t3_15


320


2240


308


927 889


5.5e-89


Protein name 








Locus Name 


Acc# 










sp:YTFL_ECOLI


P3931? 



Description 



HYPOTHETICAL 45.8 KD PROTEIN IN CYSQ-MSRA INTERGENIC REGION



ORF Name 



NTID AAID



3911568 12 7 



TIT 



ttht 



NT AA 
Length Length 
1233 



Score 



Probability 
4.5e-23



Protein name 



Description 



Locus Name 



sp:GPHC_ALCEU



Acc#
P40852



PHOSPHOGLYCOLATE PHOSPHATASE, CHROMOSOMAL (PGP)



ORF Name 



NTID AAID



AA 

Score
Length Length



1411719^ c2 2b 



TIST 



NT 
nc 

TUT 



Probability 



Protein name 



Locus Name Acc#



leucine aminopeptidase 



|gp:PPtT010251- 



Description 



AJ010261 



Pseudomonas putida pepA gene.



ORF Name 



NTID AAID 



4144318 13 IF 



TIT 



TIZT 



NT AA
Length Length
362 1089



Score Probability
766 5.9e-76



Protein name



Locus Name



probable lonxctransporter 



pir:F70819



Acc#
F70819 



Description 
ORF Name



NTID AAID



4976550 tl 3 



TIT 



TIZT 



NT AA

Length Length 
312 



Score Probability 



4.6e-37



Protein name 



Locus Name 



Acc# 



sp:YBHD_ECOLI



Description

HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN MODC-BIOA INTERGENIC REGION



P52696:P75761



NT AA
ORF Name NTID AAID Score Probability
Length Length



1441017 cl -3« 



TIT 



TIZT 



^T 



YITT 



1.2e-07 



Protein name 



Locus Name 



opacity protein opa51 



pir:S36329



Description



Acc#

S36329:S28628






ORF Name 



14452827_c3_53



NTID



AAID 



NT 



12246 



[sir 



AA 
rigt 
T7TT 



Score Probability 
Length Length



II .Oe-25 



Protein name 



Locus Name 



ribosomal protein S15 



pir:S38882



Acc# 
S38882 



Description 
ORF Name 



NTID 



14494026_c2_50



TIT 



NT AA

AAID Score Probability

Length Length



2247 



[5W 



|9.1e-4S 



Protein name



Locus Name 



sp:HIS1_BACSU



Acc#
O34520



Description
ATP PHOSPHORIBOSYLTRANSFERASE



ORF Name



NTID 



AAID 



14509682_c2_45



2248 



NT AA

Length Length
165



Score Probability
2!jQ | 



T77e=T9 



Protein name 



Description 



Locus Name 



gp:VCU39068



Acc#
U39068 



Vibrio cholerae pathogenicity island, partial and complete cds . 


NT AA

ORF Name NTID AAID

Length Length


Score Probability


Ib77.65.c2 48 ■ V29 ! 2249 . 96 291. 


±82 | 4. be -14 



Protein name 



Description 



Locus Name 



sp:YRPM_ACICA 



Acc#
P33989



HYPOTHETICAL 9.2 KD PROTEIN IN RPON-MURA INTERGENIC REGION (ORF5)


ORF Name


NT AA
NTID AAID

Length Length 


Score 


Probability 


1651093_c2_52


330 2250 525 1578


545


1.6e-52



Protein name 



Description 



Locus Name 



sp:FUMB_ECOLI



Acc#

P14407:P78139



FUMARATE HYDRATASE CLASS I, ANAEROBIC, (FUMARASE) 



ORF Name 



|2344593i_c2_51 



Protein name 



NTID AAID



331



2251



NT AA
Length Length
1 [OF? 



Score Probability 
1.3e-94



Locus Name 



histidinol dehydrogenase 



pir:E70368



Acc# 
E70368 



Description 
ORF Name 



23650253_c2_49



NTID AAID 

]EB_ 



2252 



NT AA
Length Length 
421 I . 11255 



Score Probability
1337 1.8e-136



Protein name 

Description 
TRANSFERASE) (EPT) 



Locus Name 



sp:MURA_ACICA



Acc#
P33986



ORF Name 


NTID


AAID


NT

Length


AA
Score
Length


Probability 


2381950_c3_58 


333


2253


61


| m | 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID


NT
Length


AA

Score
Length


Probability


23867127_c1_41


334


2254


275


828 147


5.2e-10


Protein name








Locus Name 


Acc# 










sp:YRAP_ECOLI P45467


Description 












(O191)


ORF Name 


NTID 


AAID


NT
Length


AA

Score
Length


Probability


2439ii637_ L ci_40 


335


2255


149


450 143


6.2e-10



Protein name



Description
HYPOTHETICAL 14.8 KD PROTEIN IN AGAI-MTR INTERGENIC REGION (O131)



Locus Name



sp:YRAM_ECOLI



Acc#
P45465






ORF Name NTID AAID


NT

Length


AA
Length


Probability 


34010260_tl_l . U'-^ • 1 2256 - 


115 


, 560 1204 1 

__ I 1 


2.1e 


^lb 


Protein name 




Locus Name 




Acc#


general stress protein homolog ykzA 




pir:F69870 




F69870 












ORF Name NTID AAID 


NT 

Length


AA
Length 


Probability 


5079188_t3_35 337 2257


163


492 461


1.2e-41


Protein name 




Locus Name 




Acc# 


hypothetical protein




gp:ASA224767




AJ224767 


Description










Acinetobacter sp. ADP1 lon gene and ORFs.










ORF Name NTID AAID


NT


AA Score


Probability 


p30087_c3_61 - | |338 | |2258 | 


I 370 1 


(ill? | |922 


|1.7e 


-92 


I 


Protein name




Locus Name 




Acc#






sp:HIS8_ACEXY




P45358


Description 










PHOSPHATE TRANSAMINASE)




ORF Name NTID AAID


NT 
Length 


AA
Score
Length 


Probability 


9b4837_c2__44. 339' '2259,-. 




2100 ; 2198 ' 


1 ile- 


-227 • 


Protein name




Locus Name




Acc# 


polyribonucleotide nucleotidyltransferase


gp:PPY18132 




Y18132


Description 










Pseudomonas putida rpsO and pnp genes.




ORF Name NTID AAID


NT
Length


AA

Score
Length


Probability


969392_f1_13 340 2260


73


222






Protein name 




Locus Name 




Acc# 



Description 
NO-HIT 






ORF Name 


NTID AAID


NT 
Length 


AA 

Score
Length 


Probability 


1070165_c3_42 


341 2261


72 


219




Protein name 






Locus Name 


Acc#


Description 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length


AA

Score
Length 


Probability 


1099375u_ri_2 


342 2262


137


414




Protein name






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name


NTID AAID


NT AA
Score
Length Length 


Probability 


|2Q884fc77;_c3_43 


| |343 |2263 


| ,60 | 


|1683 | . |1389 


|b.7e-142 


Protein name 






Locus Name


Acc#


probable acyl-CoA dehydrogenase




pir:B75282


B75282 


Description 










ORF Name 


NTID AAID 


NT 
Length 


AA
Score
Length


Probability


24395191_cl_31 


344 2264


97


294 71


0.011


Protein name 






Locus Name


Acc# 


conserved hypothetical protein aq_1236


pir:F70406 


F70406


Description 










ORF Name 


NTID AAID 


NT
Length


AA

Score
Length


Probability


3380468u_ci_35 


345 2265


796


2391 709


4.4e-72


Protein name






Locus Name 


Acc#


site-specific recombinase


gp:NGU82253 


U82253 



Description 



Neisseria gonorrhoeae site-specific recombinase (gcr) gene, complete cds.






ORF Name 



NTID



AAID 



3.408516b 13 ^0 



NT 
Length 



AA 
Length 
114 91 



Score 



1327



Probability 
2.1e-135



Protein name 



Description 



Locus Name 



sp:RPSD_PSEAE



Acc# 
P26480 



RNA POLYMERASE SIGMA FACTOR RPOD (SIGMA-70)








ORF Name


NTID AAID


NT 
Length 


AA 
Length 


Score 


Probability


35823506_c2_32


347 2267


514


1545


1343


4.3e-137 



Protein name 



Locus Name 



Butyryl-CoA: Acetate Coenzyme A transferase 



gp:CTACTAGEN



Acc#
Z69031 



Description 



C. thermosaccharolyticum actA gene.



ORF Name 



NTID AAID



35939753_c2_34



2268 



NT 
Length 
73 



Score



AA 
Length 
XZTI 



Probability 



Protein name 



Locus Name 



probable acyl-CoA dehydrogenase 



pir:B75282 



Acc# 
B75282 



Description 
ORF Name



1391 7 193 c2 33 



NTID AAID
349



; NT 

Length 
9B 



AA "■ 
Length 
|288 ' 



Score 



[±TT 



Protein name 



Locus Name 



probable acyl-CoA dehydrogenase 



pir:B75282 



Probability 
2.9e-09

Acc#
B75282 



Description 
ORF Name



NTID AAID



3954817_c1_27



T5TT 



NT 
Length 
159 



AA
Length

480



Score 



Probability 
2.9e-35



Protein name 



Locus Name 



probable acyl-CoA dehydrogenase



pir:B75282 



Acc#
B75282 



Description 



ORF Name 



NTID



AAID 



5167157 c2 



est 



2271



NT AA
Length Length
161 



Score 



im- 



Probability
|B.4e-06 — 



Protein name 



Locus Name



hypothetical protein PH1801 



Description 
ORF Name 



pir:A71191



Acc#
A71191 



NTID AAID 



NT 



Length Length



AA
Score



Probability 



9923125_c2_40 


352 


2272 


73 


222 




Protein name 










Locus Name 


Acc#


Description 




1 










NO-HIT


ORF Name 


NTID


AAID


NT
Length


AA
Score
Length


Probability 


103.554 3 7 J; 2_5 


353 


2273


147


444 159


2.0e-11


Protein name










Locus Name


Acc#












sp:THID_HAEIN


P44697


Description 














(HMP-P KINASE)












i 


ORF Name 


NTID 


AAID 


NT
Length 


AA 

Score
Length 


Probability 


23912S27_c3_lu. 


354 


2274


79


240




Protein name










Locus Name


Acc#


Description 












i. . 


NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA

Score
Length 


Probability 


35^67912_c3_ll 


355 


2275


306


921 483


|5.«e^46- 


Protein name 










Locus Name


Acc#










) . 


sp:PROC_HAEIN


P43869


Description















(P5CR) (P5C REDUCTASE)






ORF Name 


NTID


AAID 


NT 
Length 


AA 

Score
Length 


Probability 


4062840^_c2_9 


356


2276


191


576 206


1.3e-16


Protein name








Locus Name




Acc# 










sp:YGGT_HAEIN 




P44097 


Description 














HYPOTHETICAL PROTEIN HI1036


ORF Name


NTID


AAID 


NT 
Length 


AA
Score
Length 


Probability 


|1016406i_c2_87 


357


2277


242


729 425


8.1e-40


Protein name 








Locus Name




Acc# 










sp : YAtIB ECGLI 

» T" 




P28634


Description 














HYPOTHETICAL 26.4 KD PROTEIN IN PROS-RCSF INTERGENIC REGION (ORF3)




ORF Name \ 


. NT ID 


AAID 


NT 
Length 


AA 

— . . Score 
Length 


Probability 


14558812_c3_97- 


358 


| 2278 


425 , 


1281 |287 | 


|5.5e. 


-37 


Protein name 








Locus Name 




ACC# ■ 


probable lipD protein






pir:G70634




G70634

Description 














. ORF Name ... 


. . NT I D 


AAID 


. . NT 
,-\ 'Length 


AA 

, Score 
Length 


Probability 


|14901512_c3^_103 




| 2279 


1 lb6 1 


|4 71 | |210 | 


14 . 9e-l7 . , 


Protein name 








Locus- Name 




• Acc# 










sp:HIT_BACSU




O07513


Description 














HIT PROTEIN


ORF Name 


NT ID 


AAID 


: NT 
Length 


AA 

t ~ ¥- , \ Score 
Length ■ 


Probability : 


164813 J? 3^52 " 


1350. 


| 2280 


|431 


' 1295 |1415 | 


|7.8e- 


-145 . 


Protein name .-■ 








» Locus Name 




,Acc# 










gp:AB025342




AB025342 


Description 















Moritella marina genes, complete cds, similar to eicosapentaenoic acid
synthesis gene cluster.






ORF Name 



NT, ID 



AAID 



17068763 ±1 15 



TST" 



2281 



NT AA 
Length Length 
337 . | 11014 



Score 



11048 



Protein name 



Description 



Locus Name 



sp:HEM2_PSEAE 



Probability 
|7.8e.-106 — 

Acc# 
Q59643 



SYNTHASE) 


(ALAD) 


(ALADH) 












ORF Name 




NTID 


AAID 


NT. 
Length 


... AA 
Length :! . 


Score ' 


Probability 


23444400_c3 


J2 


|362 


2252 


336 | 


I 1011 1 


|iibi , 


iy.4e-H7 



Protein name 



Description 



Locus Name 



sp:RUVB_ECOLI



ACC# 
P08577 



HOLLIDAY JUNCTION DNA HELICASE RUVB


ORF Name ' " NTID AAID . 


NT 
Length 


AA n - 
— ■ , Score 
Length . , 


Probability 


23526552_c2_83 . 363 . | 2283 


422 


1269 316 


2.9e-28 


Protein name 






Locus Name 


Acc# 


conserved hypothetical protein yueF 


pir:G70007 


G70007 


Description ' _ 


■( 








■ ■ 'i ■ , 
ORF Name " ; NTID AAID ' 


NT 
Length 


AA 

. — .Score • 
Length 


, Probability, 


23b9bpi_11^17 364 . | |2284 


7B3 


2352 , 2265 | 


.4e-235 


Protein name ' < • ' 






.'Locus Name : 


Acc#\. 


hypothetical protein b2463




pir:F65021 


F65021 


Description 










ORF, Name- NTID AAID ' 


NT 

Length 


AA 

T - — . , Score 
Length 


Probability 


23828428_13_57 , - 365 . 1 2285 


272 


S19 | . 250 | 


1.6e-40 , | 


Protein name 






Locus Name 


" Acc# 


aldoketoreductase 


gp:AF001865 


AF001865 



Description 

Leishmania mexicana amazonensis aldoketoreductase (PTR-1) gene, complete cds.






ORF Name 



NTID 



AAID



24250012- cl 66 



2255 



NT AA 
Length Length 

] EZ^Z 



IT7TT" 



Score 



11104 



Probability 

iy.0e-ii2 — 



Protein name 



Locus Name 



glycine betaine transporter BetL 



gp:AF102174



Acc# 
AF102174 



Description 



Listeria monocytogenes glycine betaine transporter BetL (betL) gene, complete
cds.



ORF Name 



124315512 12 37 



Protein name 
Description ' 
QKP7ITT 



NTID AAID. 



NT .• AA' 

x — *_u ' T Score 
, Length Length - — 



. Locus Name 



Probability 



Acc#- 



ORF Name 



24317157 13 55 



Protein name 
Description 

NO-HIT



NTID • AAID 



NT AA - 
— - — ; Score 
Length Length — 



JEW 



175 , 



5^TT 



Locus Name 



Probability 



Acc# 



ORF Name 



2517175 tl 18 



Protein name 
Description 
NO-HIT



NTID, .AAID 



: NT . AA . ■ 

, . ~ — , Score 
Length . Length — 



12 28 9 



75T 



Locus Name 



Probability 



Acc# 



ORF ..Name 



29376681 tl 1 



Protein name 
Description 

NO-HIT



NTID.' -AAID 



T7TT 



12290 



NT ' AA .. 

Length Length 
F4" "1 'Vt>i 



Score- Probability 



Locus Name 



Acc# 






ORF Name 


NT ID 


AAID . 


NT 

Length 


A TV 1 

T — . Score 
Length 


Probability 


30360452_£I_6 


371 


2291 


80 


243 




Protein name 


' * 






Locus Name . 


. : :Acc# 


Description 






- 






NO-HIT








\ . . 




■■ ORF Name 


NTID 


AAID 


. NT , 
Length 


AA 

, — . , Score 
Length 


Probability 


pt06<5.25i7_c3_106 


If 7 * 


2232 


473 


±440 514 


3.0e-43 


Protein name 








Locus Name 


Acc# ' 



sp:ACRE_ECOLI



P24180 



Description •, . 

ACRIFLAVIN RESISTANCE PROTEIN E PRECURSOR (ENVC PROTEIN)



ORF Name 


NTID 


AAID ' : 


NT 
Length 


AA 

— , Score 
Length 


Probability 


3I423232_c2_«ti 


|373 


|2233 


308 


327 327 


■ 2 . Oe- 


-29 •; 


Protein name 










Locus Name 




Acc# 


hypothetical protein Rv0241c




pir:E70938


E70938


Description 
















■ ■ '■' ■ . 1 ■ 
ORF* Name 


NTID • 

- • . 


■ AAID 


NT ,.- "■ 
Length 


AA 

. — . , .Score 
Length 


Probability 




|374 


2234 




215 | 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


. >. ORF Name 

.- r- 


. NTID 


AAID 


NT 
Length, 


AA • ' " • 
, — . ,. Score' 
Length 


Probability 


3411043S^r2_30 


|37b 


|2295 


116 ' | 


pbi | pa | 


|0.030 


Protein name 










Locus Name ■ 




,Acc# 


microfilarial sheath protein SHP3






gp:LSU54556


U54556 



Description 



Litomosoides sigmodontis microfilarial sheath protein SHP3a (shp3a) and
microfilarial sheath protein SHP3 (shp3) genes, complete cds.






ORF Name ; 



NTID AAID 



14147193 t2 29 



2295 



Protein name 



dihydroxy-acid dehydratase



Description 



NT • AA 

Length , Length 
163 5 I .11908 



Score 



urn - 



Locus Name 



pir:DWECDA



Probability 
|3.4e-242 



Acc# 

A27310:D26570:S48894:S30669:F6



ORF Name 



NTID AAID 



4350088 c3 96 



T7T 



TZTT 



NT AA 
Length Length 
TSF 1 11377 



Score 



Probability 



Protein name 



Description 



Locus Name 



gp:MLCB1883 



Acc#
AL022486 



Mycobacterium leprae cosmid B1883.


ORF . Name 


NTID AAID 


NT 
Length 


AA :■■ 
t — ■ i ■ Score 
Length. 


. Probability 


4381318_t3_56 . 


378 ; • 2298 


|250 


... 753 585 . 


9.0e-57 


Protein name 






Locus Name 


, Acc# - 








sp:CCA_ECOLI 


P06961


Description 










(TRNA CCA-PYROPHOSPHORYLASE) (CCA-ADDING ENZYME)


ORF Name " 


NTID .' AAID 


NT 
■ Length 


AA . - 
- ; . — Score 
Length , . 


Probability : • ..' 


4712537_cl_60 


379 2299 


[117 


P 54 1 




Protein name . 






• Locus Name 


- Acc# 


Description ' 1 










NO-HIT


ORF Name 


NTID AAID 


, NT 
Length , 


AA 

■■ — Score 
Length ■ -■ - 


Probability 


4769050 _c2 79 


380 ■ 2300 


r i 


|300 | (117 | 


|3.5e-07 


Protein name 






Locus Name 


Acc# 


hypothetical protein APE0395




pir:B72732


B72732



Description 






' ORF Name 


NTID 


AAID 


"NTT 

Length- 


' AA n • 

T — . , Score 
Length 


Probability 


5266540 tl 8 


381 


2301 


219 


6 6 0 | 




Protein name 








Locus Name 


Acc#. 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA ■ 

^~ Score 
Length 


Probability 


6250012_tt_7 ■ 


382 


2302 


313 


942 ' 552 




Protein name 








Locus Name 


Acc# 


ferredoxin--NADP+ reductase




pir:A57432




Description 










A57432:A53967


ORF Name 


NTID 


AAID 


NT . 
Length 


AA 

\ — , , Score 
Length 


Probability,. 


6697266_Cl_62 


383 


2303 


\ 78 


237 • 




Protein name 








• Locus Name 


Acc# 


Description 








"V. 




NO-HIT


ORF Name 


NTID ' 


AAID 


' NT 
* . Length 


AA , 

* v ; Score 
Length : 


Probability 


|68171$l_c2_89 


384 


|2304 


|- 575, | 


'2525 | ■ 2815 


|3..5e-293 | 


Protein name . 








Locus Name 


. Acc#' ' . ■ 



Description " .' 



sp:YHIV_ECOLI



P37637 



HYPOTHETICAL 111.5 KD PROTEIN IN HDED-GADA






ORF Name 


NTID AAID 


NT 

Length 


AA ' • 

— : . Score 
Length 


Probability ■ 


781302_e3_58 


38b 2305 


185 . 


■ 558 541: 


4.1e-52 ; | 



Protein name . 

Description

HYPOXANTHINE PHOSPHORIBOSYLTRANSFERASE (HPRT)



Locus Name 



sp:HPRT_ECOLI



Acc#. 
P36766 






ORF Name 



NTID 



AAID



100305 c3 168- 



NT . AA 

Length Length 
1251. 



Score 



[751T 



^T8~ 



Probability 
9.8e-51 



Protein name 



Description 



Locus Name 



sp:YHHW_ECOLI



Acc# . 
P46852 



HYPOTHETICAL 26.3 KD PROTEIN IN GNTR-GGT INTERGENIC REGION 


(F231)


ORF. Name NTID AAID ■ " — , 

Length 


■AA 
Length 


Score 


Probability 


10604658 _t2_36 387 2307 488 


|146V 


|V0b 


|1.7e-69 



Protein name 



Locus Name 



RdxB 



gp:RSU67862



Acc#
U67862



Description 



Rhodobacter sphaeroides rdxB and rdxH genes, complete cds, and ccoP and rdxI
genes, partial cds.



ORF Name 



NTID 



AAID 



12509836 F2 57 



3W 



NT . AA ' 

r — . -i T — Score 

Length Length 

137 I 1414 I [T7fr- 



Probability 
|1.2e-13 " 



Protein name 



Locus Name 



hypothetical protein R186.1 



pir:T24235



Acc# 
T24235



Description 
\ ■ ORF Name 



NTID ; 'AAID 



1272201 c3 159 



NT , AA' 

Length Length 

1169 \ 



Score Probability 



^7T 



|1Q9 | |8 : 2e-05 



Protein name 



Locus Name 



hypothetical protein SPAC869.06c



pir:T39117 



Acc# 
T39117 



Description . 
ORF Name 



NTID AAID ' 



NT AA 
• — , ■ — , Score Probability 
Length Length — — - — L • 



113080050 11 26 



0.021 



Protein name 



Locus Name 



put 



gp:AF000001



Description ; 



Acc# 

AF000001:AF013957



Salmonella typhi topoisomerase B (topB), single strand binding protein (ssb),
Ytl2 homolog (ytl2) genes, complete cds; pil operon, complete sequence; Rci
(rci) gene, complete cds.






ORF Name 



NTID 



NT 



AA 



13723751 C3 176 



AAID Score Probability
Length Length

TFTT 



TTTT 



[FIT 



11*57 



|1.4e-138 



Protein name 



Locus Name 



FixNd 



Acc# 
Z80339 



Description 



R. leguminosarum fixNd and fixOd genes




ORF Name 


NTID AAID 


NT 
Length 


aa: 

T — A v Score 
Length 


Probability 


140_tl_li . . 


392 | |2312 


144 


435 




Protein name 






Locus Name 


Acc# 


Description 










NO-HIT 


ORF Name 


NTID AAID 


• NT 
Length 


AA 

T — , Score 
Length ■ 


Probability. 


I4«5025i_ci_i25 


|393 | |23i3 | 


243 | 


732 J ' |3S5 


|i.4e-3b | 


Protein- name 






Locus Name 


ACC# 



. . Description 



005029 



HYPOTHETICAL 


PROTEIN HI0672










,0RF Name . 


NTID AAID , 


• NT 
Length 


AA 
Length 


Score 


- Probability • 


i5«26-i_i2_53 


394 2314 


. IbS ■: 


477 


522 


4.3e-50 •■ - 



Protein name 



Description . 



Locus Name- 



sp:RL13_HAEIN



ACC# , 
P44387 



50S RIBOSOMAL PROTEIN L13












ORF Name 


NTID 


■ AAID 


. NT 
Length 


AA. 
Length 


Score 


Probability 


ibU59456_t3_74 




| 2515 


ipt--. 


1291 


I 105 I 


|6 .be- 0b: 



Protein name 



Locus Name 



hypothetical protein PH0639 



pir:H71108



Acc#
H71108 



Description 






ORF Name NTID AAID 


NT ' 


— Score 

T .on rr t~ V» 


Probability 


16803811 J:I_I3. ' ||396 | 2316 : 




*i ■ |« V | 


|0 . 040 . | 


Protein name 




Locus Name 


Acc# 


somatostatin sst2B receptor 




gp:RNSST2B 


X98234 


■■ - ■ ■ ■ ■ 
Description -•■* . 


' ... l - 


■ 




R.norvegicus mRNA for somatostatin receptor








ORF Name . NTID AAID 


NT * ' 

In 1 

, Length 


AA 
Length" 


Probability 


16US3590_c3_164; 397 |2317 


2J3 


702, ■ 265 


V.ie-2J . | 


Protein name 




Locus Name 


Acc# 






sp:YEAZ_ECOLI 


- 1 P76256:O08 


Description ' 






476:008477 


HYPOTHETICAL 25.2 KD PROTEIN IN FADD-PABB INTERGENIC REGION




- ORF Name ' NTID AAID 


NT 
Length 


AA 

T — , , Score! ... 
■ : Length 


Probability 


I95633i2_c2_137; ' 398 - |231S 


96 


|291 | |71 , | 


|0.038 . | 


Protein name 1 

■"•»."••*■■ . j ■ 




. " • Locus Name 


Acc# 






sp:YYAB_BACSU


P37523 ■ 


Description 








HYPOTHETICAL 17.0 KD PROTEIN IN SPO0J-GIDB INTERGENIC REGION






ORF- Name . • NTID AAID 


NT ■ 
Length 


AA i • 
Length 


Probability . 


19632661_13_91 399 | 2319 


134 = 


405 




Protein 'name . ... ; 




Locus Name "* 


Acc# 


Description 








NO-HIT




ORF Name NTID , AAID 


NT 
• Length 


AA 

,' "— . , ■■- Score 
Length 


Probability 


2UiS77_cI_95 |400 1 2320 


7B1 | 


2256 2566 


|i:le-266 


Protein name . 




Locus Name 


' Acc# 






sp:CLPB_HAEIN


P44403 • 


Description 








CLPB PROTEIN 













ORF Name 



121985931 13 55 



Protein name 



NTID AAID 



NT 



AA 



Length Length 



Score Probability 



| 401 


2321 


|2il 636 


*** | 



Locus Name 



sp:UCRI_CHRVI



Acc#
O31214



Description , : ; 

(RIESKE IRON-SULFUR PROTEIN) (RISP)



ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score : - 


Probability 


2206665i_r2v40 


402, 


|2322 


191 




364 


2.4e-33 


Protein name 










Locus Name 


Acc# . 












sp:YAJCl_HAEIH 


P44096


Description, 
















HYPOTHETICAL PROTEIN HI1034


ORF Name 


NTID 


AAID 


NT : 
Length 


AA 
Length 


■. Score 


Probability 


23525307_c2_1.46 ■-. 


403 


2323 


213 


542-. 


554. 


3. be-by ■ | 


Protein, name 










■ Locus Name 


• ; , • Acc# 


cytochrome -c oxidase, type 


cbb3 cha 


in - tixQ^ 


pir:S77596


S77596 


Description 
















ORF Name 


. NTID 


AAID 


NT 
Length 


AA 
Length . 


Score 


Probability 


23720002_c2_140 


404; 


|2324 ... 


51. 








Protein name 










Locus Name 


Acc# '.. 


Description 
















NO-HIT


ORF Name'. 


NTID 


AAID 


NT 

Length 


AA 
Length 


Score 


Probability 


23560S81_t2_39 - 


1 405 


|232S | 


455 ■ - 


1365 


, P17 |, 


|6.4e-19y , 


Protein name '! 










Locus Name y 


' ACC# 



Description 
LIGASE)



sp:ASijV_HAkliN 



P44315 






ORF Name 



NTID 



AAID ' 



123854180 tl 18 



TOT" 



12325 



NT 

n 

T7T 



AA ' , 
T — ~ Score 
Length Length ■■ 



Probability - 
Lie- 21 ~ 



Protein name 



Locus Name 



CorE 



|gp:AF130857 



Acc# 
AF130857



Description 

Salmonella typhimurium cobalt resistance locus, partial sequence.



ORF Name 



NTID 



AAID 



NT . AA 
r ■ T " Score 

Length Length — — — 



23947151 tl 19 



TUT 



TTTT 



TUT 



TTT 



Probability 
|2.2e-07 



Protein name 



Locus Name 



unknown 



AF147448 



■ Acc# - ' 
AF147448



Description 



Pseudomonas aeruginosa strain PAO1 penicillin-binding protein 2 (pbpA),
rod-shape-determining protein (rodA), membrane-bound lytic transglycosylase
(mltB), rare lipoprotein A (rlpA), penicillin-binding protein 5 (dacA), and
lipoate biosynthesis protein B (LipB) genes, complete cds and unknown gene.



ORF Name ■*. • 


NTID 


AAID 


NT 
Length 


AA 

- — . , ■ Score 
Length 


Probability 


24083208_t3\jB2 , | 


|408 


2328 


71 1 


216 






Protein name 








Locus Name ' 




Acc# 


Description' • . ' 














NO-HIT


ORF Name 


NTID 


AAID 


NT 

Length 


AA

. — , Score 
Length • — . 


Probability •' 


2427IB75_cI_i22 


409 


| 2329 


5S8 


. 1677 . |1857 | 


|1.5e 


-191 , 


Protein name 








Locus -Name 




Acc# . 










sp:PYRG_HAEIN




. P44341 


Description 














CTP SYNTHASE (UTP--AMMONIA LIGASE) (CTP SYNTHETASE)






ORF Name 


NTID 


AAID 


.NT 
Length 


AA ' 
" , — :. , Score 
Length 


Probability 


24337827_rl_15 


410 


| 2330 


355 - | 


1068 a|1038 J 


|8 . 9e- 


-105 


Protein name 








Locus Name 




,' Acc#. • 


dihydroorotase








pir:T10453 




T10453 


Description 


















ORF Name ■ ■ 


NTID 


AAID 


NT. 
. Length 


i ^ ■ , Score 
Length, 


Probability 


24344i38_i3_68 


1 1 411 


|23 31 


1 70 1 


|213 




Protein name.. 








Locus Name . 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


■ NT 
. Length 


AA 

, — , , . Score 
■Length 


Probability 


244'i7B75'_ci_i24 


; 412 


2332 - 


141 


425 • 135 


4.3e-09 


Protein name 








■Locus Name 


• Acc# 










sp:YGBQ_HAEIN


P44035 


Description 












HYPOTHETICAL PROTEIN HI0673


ORF Name . 


NTID 


AAID : 


NT , 
Length 


: AA • ' 

- — . ■ Score . 
Length • . 


Probability 


|245002S6_c3_ : i63 


l 41i 


2333 


II 515 


1550 . [1458 | 


|2.6e-i52 



Protein name ' - 

Description - : , ' . ' 

SIGNAL RECOGNITION PARTICLE PROTEIN 



Locus Name 



sp:SR54_ECOLI



Acc# 
P07019 



(FIFTY-FOUR HOMOLOG) (P48)



ORF Name 



NTID AAID 



24648402 11 22 



TJJT 



NT 



" '■ AA • n 

Score 

Length Length . — - 



Probability 
IHF^| |5.4e-E>9 



Protein name 



Locus Name 



probable exonuclease



pir :T03455 



Acc# 
TO 3 46 5 



Description ■ 



ORF Name 



NTID AAID 



24844552 c3 167 



,' NT . AA 

Length Length 
|548 , 



1547 



Score 



11470 



Probability 
|1.5e-150 



Protein name 



Locus Name 



probable pitB protein 



pir :E70731 



Acc# 
E70731 



Description 






ORF Name 



NT ID . AAID 



129880042 13 51 



F 



NT . AA 
Length Length 
m$ ""I 11458 



— , Score 



Probability ^ 
3.1e-61 



Protein" name 

Description , 
EXONUCLEASE SBCD



Locus Name 



sp:SBCD_ECOLI 



Acc# 
P13457 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , - Score 
Length 


Probability 


Jlbou^b ij of 






1 \£H 

II 






Protein name. 








Locus Name 


Acc# 


Description 












NO-HIT 










... - 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — , Score 
Length - 


Probability 




Irr _L O 


1 

1 ■ r - • 


jy u 1 


10^1 — 1 12^1 — 1 


2 P^- 1 R 1 
6 . o c io 


Protein name 




i > 




Locus Name 


. Acc# 


hypothetical prot 


ein kW72 




i. 


pir:E71694 


, E71694 


Description 












< : ORF Name 


NTID 


AAID 


, ; NT - 
Length . 


AA ■'. 
' ,— , Score 
Length 


•Probability 




| 41^ 






1255 - 1355 


2.3e-i,58 , 


Protein name 








.;. Locus Name 


Acc# 




■ i 






sp:CYB_CHRVI


O31215


Description' '■• 












CYTOCHROME B


■' ORF Name 


: NTID 


AAID '• 


■■ NT 
Length 


AA ••; o 

, T — ■ -Score 
Length 


Probability 


33707182 il_27 


1 1 420 


2340 




. |7b9 | 364 


2.7e T 46 


Protein name. .. 








Locus Name 


Acc# 



sp:CY1_CHRVI



O31216



Description 
CYTOCHROME C1 PRECURSOR




ORF Name 



NT ID 



AAID 



S3-8VbSa5 c3 Ib7 



NT 
Length 
68 • 



AA 
Length 
-1207 



Score, Probability 



Protein 'name 
Description 



Locus Name 



Acc# 



NO-HIT


ORF Name NT ID ' 


AAID 


NT 
Length 


AA 

T — . , Score 
Length 


Probability 


3406458i_ci_ii9 | 422 


2342 


87 


264 171 1 


J0.026 , 


Protein name 








Locus Name 


Acc# 


cb-type cytochrome c oxidase CcoQ subunit




gp:AB024290 


AB024290 


Description 




Magnetospirillum magnetotacticum ccoN, ccoO, ccoQ, ccoP genes for cb-type
cytochrome c oxidase, complete cds


ORF Name NT ID 


AAID 


NT 
Length 


. AA 

T : — , , Score 
Length 


Probability 


34i2025i_ci_ : 105 423 


2343 


322 - 


969 |S47 |' 


|2.4e-« . . | 


Protein name 








Locus Name ■, 


Acc# ' 



Description 
POLYPRENYLTRANSFERASE)



sp:UBIA_ECOLI



P26601 



,ORF Name 


NTID 


AAID . 


NT . 
Length 


.AA 

T — . v , Score 
Length 


■i- 

- Probability 


^379680_c2_127 


424 - 


■ 2344 . 


r 


183, ■ 




Protein name 








Locus Name 


' Acc# 


Description 1 . 








• ' ' ■.' ■ ' ■ 




NO-HIT


ORF Name ■ 


NTID 


1 AAID 


. NT - 
Length 


AA 

t— . , Score 
Length 


Probability • 


3$05686_c3_I55 




J 12345 


653; 


1352 2231 


3.4e-231 | 


Protein name 


Locus Name 


Acc# • 



sp:GIDA_PSEPU



P25756 



Description 
GLUCOSE INHIBITED DIVISION PROTEIN A 



ORF Name " NT ID ' 


AAID 


NT 
Length 


AA - 

„ — \ , Score 1 
Length 


Probability 


3932753_c2_149 - . " 425; - 


2346 


IP" 


|2304 |235 | 


|1 .3e-i6 


Protein name 






. Locus Name 


Acc# 








sp:REC2_HAEIN


P44408 


Description . 










RECOMBINATION PROTEIN 2 


ORF Name NT ID 


AAID 


NT 
Length 


AA ' 
• — , Score 
Length 


Probability 


394231S_12_54 427 


\ |2347 


131 


396 |507 | 


|1.7e-4U | 



Protein name 



Description 



Locus Name 



Acc#
P31782



30S RIBOSOMAL PROTEIN S9










ORF Name 


NTID 


AAID 


• NT 
Length 


AA ■ 
T — , , ' Score 
Length 


Probability 


3947193 _t2_56 ; 


42^ 


U 234U 


■in 


399 pil | 


|9.7e-28 


Protein name 








Locus Name 


Acc# 



Description 



sp:SSPB_HAEIN 



P45206



STRINGENT STARVATION PROTEIN B HOMOLOG








.. ORF Name 


NT 

NTID AAID ..- ' — 

Length 


. • AA . 
Length " 


Score ' 


, Probability ; . 

'i : i 


4119075_ci_ 


103 ; ,o. 429 | |2349 . : 281 


I 846 


454 


|6.0e-44 ■ . • 



Protein name 



Description 



Locus Name 



sp:BACA_ECOLI



Acc#

P31054:P39203



(EC 2.7.1.66)












ORF Name 




NTID AAID 


NT 
Length 


AA 
Length 


Score 


. Probability 


4334463_c3_ 


172 


430 J2350 


159 | 


. l Bi0 1 


I 78 1 


p.8e-0b | 



Protein name 



unknown 



Description 



Locus Name 



gp:AF083916



Acc# 
AF083916



Rhizobium etli Fnr-type transcriptional regulator FnrNc (fnrNc) gene,
complete cds; and unknown genes.






ORF Name 



NT ID AAID 



NT ' AA , • ^ . , . 
. l ~^ T _ _ — , Score Probability 
Length Length — — —■ — i 



14798194 c3 ITS" 



Protein name 



2551 | |358 | [1077 [ [358 | |5:6e-48 - " — 

Locus Name Acc# 



cytochrome-c oxidase, fixP chain: cb-type
cytochrome-c oxidase 32K chain: cytochrome
b410: fixP protein



pir :D47468 



D47468 



Description 
ORF Name 



NT ID AAID 



1^00017 t3 71 



PIT" 



NT 
n 



AA . • 

— ■ Score 
Length Length . 



Probability 
|4.6e-53 



Protein name 

... ., ■•- » * ^ 

Description ■ 
RIBONUCLEASE T (EXORIBONUCLEASE T) (RNASE T)



Locus Name 



sp:RNT_VIBPA



Acc#
P46232



ORF Name 
|520003_ci_i2ig 

Protein name 
Description '. 
NO-HIT



NTID AAID 



NT AA 
Length Length 
204 



Score ■ Probability 



67 



Locus Name 



Acc# 



ORF Name 



NTID AAID 



5203453 c3:181 



NT AA 
length Length 
1445 



Score Probability 
1338 | [14 £7 | |3.1e-150 ■ 



Protein name 



. Description 



Locus Name 



sp:ENO_ECOLI



Acc# 
' P08324 



GLYCERATE HYDRO-LYASE)












ORF Name 


' NTID 


AAID 


NT 
Length 


AA. 
, Length 


Score 


Probability 












52813i8_c3_ 


180. 435, 


|2355 


290 


873 


J1037 | 


|1. le-104 | 



Protein name 



Locus Name 



2-dehydro-3-deoxyphosphooctonate aldolase



Description 



gp:AF098791



Acc# 
AF098791 



Pseudomonas aeruginosa 2-dehydro-3-deoxyphosphooctonate aldolase (KdsA) gene,
complete cds.






ORF Name 



NTID AAID 



NT 



AA 



t ^ t :— Score Probabi lity 
Length . Length — ~- — * 



b901067_cl_104 1436 ; 2356 


1 F 4 


825 


202 


J.be-16 | 


Protein name ' , 




Locus 


Name 


ACC# 






sp:YHIQ_HAEIN


P44901 


Description ■ 










HYPOTHETICAL PROTEIN HI0849 




ORF Name ■ NTID AAID 


NT 

- Length 


AA

Length 


Score 


Probability 


70546b0_ci_118 437 12357 






|53 | 




Protein name 




Locus 


Name . 


Acc# 


ORF-D 




gp:ECO10KLS


';• D11109 


Description 










E. coli gene for 10K-L and 10K-S protein










ORF Name NTID AAID 


NT . 
Length 


AA 
Length 


Score ' 


Probability 


95770b_cl_113 4.38 | 2358 




1008 


|407 | 


|6.be-38' | 


Protein name 




Locus 


Name 


' Acc# •. 


putative regulatory protein * 




|gp:AF087482 


AF087482



Description 



Pseudomonas aeruginosa clcC and ohbH genes, LysR-type regulatory protein
(clcR), chlorocatechol-1,2-dioxygenase (clcA), chloromuconate cycloisomerase
(clcB), dienelactone hydrolase (clcD), maleylacetate reductase (clcE),
transposase (tnpA), ATP-binding protein (tnpB), putative regulatory protein
(ohbR), o-halobenzoate dioxygenase reductase (ohbA), o-halobenzoate dioxygenase



■ ORF Name 


NTID AAID 


NT AA ■ 
T r~~ tl . — , Score 
Length .Length 


Probability 


9960917_t3_90 


| 439 -,.. 23b9 . 


223 | 572 | 354 | 


J2 7e-32 | 



Protein name 

Description ' .'-*-.. 
STRINGENT STARVATION PROTEIN A



Locus Name 



sp:SSPA_ECOLI



Acc#. 
P05838 






ORF Name ' 
|10532690_jbi_17 



NTID 



AAID 



NT AA 
■ — — , Score 
Length Length ■ 



Probability 
9.8e-99 



Protein- name 



Description 



. Locus Name 



sp:NUON_ECOLI



.; , Acc# 

P33608:P78281





CHAIN 14) 










ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 
f — , , Score .. 
Length 


, Probability • 


1059455_ci_85 


| |441- 


2351 




228 




Protein name 








Locus Name 


Acc# . 


Description 












NO-HIT


ORF Name 


NTID 


AAID . 


•■- NT 
Length . 


AA ■ , •■ 
, .— . , Score 
Length • 


Probability 


10734830_ci_89 


442 


2352 


50 


183 




Protein name 








. Locus Name 


Acc# 


Description ; 












NO-HIT


ORF Name 


■ NTID 


• AAID 


NT 
Length 


AA * 
, • \ . Score . \ 
Length 


Probability, 


138!>390_t3_b4. ■■ 


443' 


2353 


215, . 


651 ' |409. | ■ 


|4 . Oe-38 


Protein name 








Locus Name 


Acc# 










sp:NUOJ_ECOLI


1 P33605:P78 


Description ■ 










, 236 


OXIDOREDUCTASE CHAIN 10) (NUO10)








ORF Name . .. 


■ NTID 


AAID . 


NT , 
Length 


AA ' • 
, — \ Score 
Length 


Probability 


[1386342!>_t2-23-.. 


444 


.2354 . 


1 276 1 


831 | |480 | 


|l,.2e-45 ; 


Protein name 








Locus Name: 


Acc# 


hypothetical protein RP682 ' 


pir:E71674


E71674



Description 






ORF Name 



NT ID AAID . 



114454827 12 28 



12365 



NT AA 
Length Length 
211. 



Score 



WW 



Probability 
3..1e-54 • 



Protein name. 



Locus Name 



pyridoxamine 5'-phosphate oxidase



pir:B75513 



Acc# 
B75513 



Description 
ORF Name 



NT ID 



AAID 



NT 



AA 



Length Length 



Score • Probability 



14475702_ cl_90 


446 


2356 -259 


' 780 |9i | |0..0u08i 


Protein ■ name 






Locus Name 


Acc#


ORF8






gp:D78257


D78257 


Description - 










Enterococcus faecalis plasmid pYI17 genes for BacA, BacB, ORF3, ORF4, ORF5,
ORF6, ORF7, ORF8, ORF9, ORF10, ORF11, partial cds.






. ORF Name 


NT 

NTID AAID - , • ' — , 
- • Length 


AA ■ 

— ' , Score Probability 
Length • 


i4578202_tl_12 


447 


2367 182 . 


549 s |763 | |1.2e 


-7,5 . •. 


Protein name ., 






Locus Name 


Acc# , 



sp:NUOI_ECOLI



Description ■ .. ■ ' 

OXIDOREDUCTASE CHAIN 9) (NUO9)



P33604:P76488:P78183



ORF , Name 



NTID - , AAID 



15625443* cl 84 



NT AA 
Length . Length : * 
61 



— Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 



175760 13 .46 



NTID 

1449 



.AAID 



12269 



■ NT AA 
. Length Length ; 
£T5 — 



Score 



T5T" 



Probability 
|4.4e-32 ; 



Protein name 



Locus Name 



NADH dehydrogenase chain A 



AF057063 



Acc#
AF057063 



Description 



Erwinia carotovora subsp. carotovora aspartate aminotransferase (aat) gene,
partial cds; HexA (hexA), NADH dehydrogenase chain A (nuoA), and NADH
dehydrogenase chain B (nuoB) genes, complete cds; and NADH dehydrogenase chain
C (nuoC) gene, partial cds.



ORF Name 


NT ID 


AAID 


"KPT 

Length 


AA 
Length • 


Score 


Probability.. 


19806b77_12_27 


, ,14.0 


| 2370 


| 452 


11359 


.l 1177 l 


|1.7e 


-119 


Protein name 










Locus 


Name 




. ACC# 












sp:MRSA_HAEIN


P45164 


Description 


















MRSA PROTEIN HOMOLOG


















ORF Name 

■ «» 


NTID 


AAID 


NT 
■ ; Length 


AA 
Length 


Score 


.Probability 


2il0657_tl_3 


451 


2371 


328 


|987 | 


I 760 1 


|2.6e 


-75 


Protein name 


< 








Locus 


Name 


.-» 


Acc# , 



sp:Y926_SYNY3



. P72872 



Description ' - ' 

HYPOTHETICAL 37.9 KD PROTEIN SLL0926



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— . , Score- 
Length 


Probability 


224022b2Jt2_2b 


4b2 


2372 


63. • 


; I92 i :v; 




Protein name 








Locus Name ; 


- : Acc# ■ 


Description 












NO-HIT


. ORF Name 


. NTID 


AAID . 


NT 
, Length 


AA 

— . Score 
Length 


Probability" 


23< ! >8321bJ:2 ,38 " ; 


4b3 • 


2373 


b79 


1740. 1583 


1.6e-162 " " 


Protein name 








" Locus Name 


Acc# •" 



Description 



sp:NUOM_ECOLI



P31978:P78248



OXIDOREDUCTASE CHAIN 13) (NUO13)








ORF Name 


NT I D" 


AAID 


NT 
Length 


AA 

— ^- , Score 
Length 


Probability 


|24225213_t3_50 


| 454 


2374 


265 .1 


|801 | |1183 | 


|3.8e-120 | 


Protein name 








Locus Name 


, Acc# - 


Tou2 ' 








gp:AF058689


AF058689 


Description, 



Neisseria meningitidis strain Z2491, genomic sequence.






ORF Name 



NTID. AAID 



24226b02 ci 132 



NT 
Length 
270 .' 



AA 

— , Score Probability 
Length — ~ . ■ ' - — - — ; — = ■ 



STT" 



T5W 



|7Y0e-89 



Protein name 



Description 



Locus Name 



sp:Y572_HAEIN



Acc# 
P44758 



HYPOTHETICAL PROTEIN HI0572


ORF Name 


NTID 


AAID ■ 


NT, . AA 
— - , — ^- , .Score 
Length Length 


Probability 


'2439ibb7_ri_io | 


45* 


2376 


1046 3141 1655 





Protein name 



Locus Name 



NADH , dehydrogenase (ubiquinone) , I chain 
GmuoK protein 



Description 



pir:A65000



Acc# 

A65000:S65638:S38316:S37064



ORF,. Name 



NTID AAID 



24642893 FT lb. 



TTJT 



NT 
Length 
619 . 



AA ' 

— . , , Score Probability 
Length .. — . ■• • ' - -. ■ ~^ L - 



|1.8e-186 



Protein' name 



Description 



Locus Name 



sp:NUOL_ECOLI



Acc# 

P33607:P78254



OXIDOREDUCTASE CHAIN 12) (NUO12)








- - i 


ORF Name 


NTID 


AAID 


NT ' 
Length 


AA 
Length 


Score 1 


• . Probability 


2507286_t2_22 , 


4bi3 , 


2378 


213 • 


642 


} ,|770 | 


|2.2e-76 



Protein name 



Locus Name - 



outer membrane protein B1



gp:AF045251



Acc#. 
AF045251 



Description

Moraxella catarrhalis outer membrane protein B1 gene, complete cds.



ORF Name 



NTID. . AAID 



2535213b 12 26. 



12379 



NT 
Length 

61 - 



i .— , 'Score Probability 
Length — — 



Protein name 
Description ■ 

NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID AAID 



25579763 ±3 ST" 



2380 



NT AA , 

Length Length 



Score ■ Probability 



2.3e-3.9 



Protein name 



Description 



Locus Name 



sp:FENR_ECOLI



Acc# 

P28861:P11007



(FLXR) (FLDR) (METHYL VIOLOGEN RESISTANCE PROTEIN A) (DA1)




NT AA 

ORF Name NTID AAID , — . , _ 

Length Length 


Score 


Probability v 


2S228401_c2_i05 461 2384" 156' 471 ■ 


123 


' 8.ie-08 • | 



Protein name 



Locus Name 



hypothetical protein APE1413 



pir:D72619



. Acc# 
D72619 



Description 



ORF Name 



NTID AAID 



25688176 ti 1 



. NT. AA 
Length Length 

70- ' 



Score 



Probability 
|4.8e-26 _ 



Protein name 



Locus Name 



transferrin-binding protein 2 precursor



gp:AF105251



Acc# 
AF105251 



Description 



Moraxella catarrhalis transferrin-binding protein 2 precursor (ompB1) gene, partial cds.



ORF Name- 



NTID AAID 



NT AA • 
— - — , Score Probability 
Length Length . ^— - — ~ — — 



30082693 13 51 



463 - 



2383 



1476 



11378 



1.3e-141



Protein name 



Locus Name 



sp:NUOF_ECOLI



Description 
OXIDOREDUCTASE CHAIN 6) (NUO6)



Acc# 

P31979:P78239



ORF Name 



NTID AAID 



30252036 c2 98 



NT AA 
r Length Length 



— — ■ Score Probability 



Protein name 
. Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID AAID 



31283452 tl.li 



NT 
n 



AA 

i t ~4_t- Score 
Length Length ■ : - 



Probability 



1029. 



[1125 [ . 



4.2e-114 



Protein name 



Locus Name 



sp:NUOH_ECOLI



Description . 
OXIDOREDUCTASE CHAIN 8) (NUO8)



Acc# 

P33603:P78307



: ORF Name 


NTID 


AAID 


NT • -. AA 
Length " Length t 


Score Probability' 


3182067_c3_13i 


|466 


2386 


516' 1551 


| |4.1e 


-203 


Protein name 






■ ' . Locus Name , 


ACC#; ,, 








sp:SYR_HAEIN


P43832


Description 












ARGINYL-TRNA SYNTHETASE (ARGININE--TRNA LIGASE) (ARGRS)




ORF Name 


NTID 


AAID 


NT AA 
Length Length 


Score Probability 


33723387 £1J> 




. 2387 


235 ■ | |708 \ 


|799 | |1.9e 


-79 


Protein name 






Locus Name- 


Acc# ; 



sp:NUOB_ECOLI



Description , * 
OXIDOREDUCTASE CHAIN 2) (NUO2)



P33598:P78090



ORF Name 
|33772186J:3_41 



NTID AAID 



12388 



NT - AA 

Length Length 
TTE 1 11251 



Score : Probability 



Protein name 



[1601 | |1.9e-154 :.~ 
Locus Name Acc# 



transferrin binding protein B



gp :AF039313 



AF039313 



Description., 



Moraxella catarrhalis strain LES-1 transferrin binding protein B (tbpB) gene, complete cds.



ORF Name 



J41769b0 ti 42 



NTID . 



AAID 



NT AA 
Length Length 
548 



Score , Probability 

1647 , 1 rrrr 



2.2e-29 



Protein name 

Description 
HYPOTHETICAL PROTEIN MJ0170



Locus Name 



sp:Y170_METJA



■ Acc# ■ 
Q57634 






ORF Name 



NTID 



AAID 



|34414bb2 FT 47 



..." NT . AA 
Length Length 
1 11755 



Score 



12 1 50 



Protein name 



Locus Name 



NADH dehydrogenase ' (ubiquinone) , I , chain 
CrD 



Description 



pir :D65000 



Probability 
|7.be-227 



.. Acc# 

D65000:S38313:S38312:S65634:S6



ORF Name 



NTID AAID 



NT AA 

'. -— — , Score Probabi lity 
Length Length ■ ■ - - + 



35166075 tl 4 , 



12391 



T9T" 



Protein name 



|885. | |31bi | |3.7e-2a , ~ 
Locus Name Acc#. 



periplasmic chaperone protein 



gp:AF095845



AF095845 



Description 



Pseudomonas syringae cell division/stress response protein (ftsK) and periplasmic chaperone protein (lolA) genes, complete cds.


NT • AA 

ORF : Name NTID AAID — i — ■ Score 
• LenyLh ( LengLh 


• i • ■ . ' ' 
Probability. 


36i44f87_r3_49 | 472 2392- 52 189. |240 


|3.2e-20 • . 



Protein name 



Locus Name 



sp:NUOD_SALTY



Acc# ,„ 
P33902 







'i ■ ■ ■ . - ■ - 




OXIDOREDUCTASE CHAIN 4) (NUO4) (FRAGMENT)






ORF . Name 


. NTID AAID ^ . 

Length 


AA 

— Score . 
Length 


Probability- " 


39i5693icl^0 


473 2393 |416 


1251 ■ 211 


i-.7e.-i4 ■ 


Protein name 




Locus Name 


ACC# 



Description 



gp : ECPMC7A 



X57583 



E.coli Plasmid pMccC7 mccA, B, C, D, E, F genes.


ORF Name NTID AAID '. — 

Length 


• ' AA , - ■ 
T ~~, i Score 
Length 


Probability 


4740902_c2_127 474 2394 313 


942 114 


0.00043 


Protein name 


Locus Name ■ 




Acc# 
O64252




sp:PRXH_BPMD2





Description . 
PUTATIVE NON-HEME HALOPEROXIDASE






ORF> Name 



NT ID AAID , 



479&aVb tl b 



' NT . • AA , 
Length Length 
78 



Score Probability 
|4.8e-10 - 



Protein name 



Locus Name 



conserved hypothetical protein



pir:H75273



Acc# 
H75273 



Description 



ORF Name 



NT ID AAID 



NT ■■ AA , 

_ _ ™ — __ ■ Score Probabi lity 
Length Length . — dL 



5097886 tl.14 



] [ 



1.1e-28



Protein name 



Description 



Locus Name 



sp:NUOK_ECOLI 



Acc# 

P33606:P76487:P78182



OXIDOREDUCTASE CHAIN 11) (NUO11)






i 


ORF Name 


,NTID 


.AAID 


NT .' 
Length. 


AA 

: — , Score 
Length 


Probability 


7226452_11_9 


4.77 


2357 


174 . 


|525 | |470 | 


|1.4e-44. 


Protein name 








Locus' Name 


Acc# ; 










sp:NUOE_SALTY


P33903


Description 












OXIDOREDUCTASE CHAIN 5) (NUO5)








. ORF Name . 


NTID 


AAID 


NT 
Length 


. AA " 
■ -r <• . , Score 
Length 


Probability 


10181576_t2_ i 42 


, 1478 


2398 




306 ■ 




Protein name 








. Locus Name 


Acc# 


Description 












NO-HIT


ORF Name .'; 


. .NTID 


:AAID . ' 


NT 
Length 


, AA ' . " 
T — • , Score 
Length 


Probability 


10,75i312_ti_7 


|479 


■2399 


939 


, 2820 |710 | 


|2.9e-114 


Protein name 








Locus Name 


. Acc# 



sp : YCBY_HAEIN 



Description 
HYPOTHETICAL PROTEIN HI0116/115



P44524:P43945






ORF Name 



NT ID AAID 



10975302 cl 93. 



PIT 



2TuTT 



NT AA 
Length Length 
1293 



Score Probability 
1195- I P>.!>e-13 



Protein name 



Locus Name 



probable D, D-carboxypeptidase 



Description 



ORF Name 



19587762 cl 77 



Protein name 
Description 
NO-HIT



pir:B71353 



Acc# 
B71353 



NTID AAID 



2401 



NT AA 
Length Length 
|270 | 



Score Probability 



"ST 



Locus Name 



Acc# 



ORF Name 


NTID ■ 


AAID 


NT 

Length,. 


AA ;-. ' ' 
— Score 
Length 


Probability 


19735377_t2 j 4 


| 482 


2402 


63 • 


192 | 




Protein name 








Locus Name 


. Acc# 


Description 












NO-HIT


ORF Name 


NTID 


"K 

AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


2i49i07.5_c3j.27 


|483 


2403 . 


517 


1554 |309 | 


|1.4e-4i 



Protein name 



CjaB protein 



Description 



Locus Name 



gp:CJE17971 



Acc# . • 
Y17971 



Campylobacter jejuni cjaB gene. 



ORF Name 



NTID AAID 



21520276 c3 136 



12404 



NT AA 
. Length Length 
|275 J 



Score : Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT 






ORF Name 



NTID 



AAID 



NT 



AA 



Length Length 



Score Probability 



X1603403 c3 126 


485 2405 


543 1532 |857 | 


|1 . 3e-85 


Protein name 






Locus Name 


. Acc# 








sp:YMDC_ECOLI 


P75919 . 


Description 










HYPOTHETICAL 55.9 KD PROTEIN IN CSGC-MDOG INTERGENIC REGION




ORF Name 


NTID AAID 


NT AA 
— , „ — - Score 
Length Length " 


Probability 




| |486 | |2406 


[476 . | 1431 | |1649 | 


[l.&e-lS? | 



Protein name 



Locus Name 



sp : GLNA_AZOVI 



Acc#
P22248 



Description

GLUTAMINE SYNTHETASE (GLUTAMATE--AMMONIA LIGASE)



ORF Name 



NTID AAID 



NT ' AA 

■ : — , . ■ • — - ■ ■■' • Score 
Length Length • ~ 



22306b32-c3 134 



12407 



Protein^ name 



Description 



Locus Name 



sp:LPSA_PASHA



Probability 
|6.3e-40 ~ 

... ' Acc# , 
Q05770 



LPSA PROTEIN


ORF Name 

• ■ - - I. • _ . 


NTID 


AAID' 


NT 
. Length' 


,AA 
Length 


Score 


Probability 


22442010_i:l_l 


488 


1 2408 . 


1 M - 1 


, 1065 1 


I 460 1 


|i.8e-42 



Protein name. 



Locus Name 



unknown 



Description 



gp:AF116284



Acc# 
AF116284



Pseudomonas aeruginosa DnaJ-like protein gene, complete cds; and unknown genes.



ORF Name 



NTID AAID 



NT AA 
— — Score 

Length Length - — — — 



12375337 t3 49 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 



ORF Name 



NTID • AAID 



23944431 c'A TTZ 



Protein name 



hypothetical protein APE0029 . 



Description 



NT . AA 

Length Length 

m~ — 1 



Score 



Locus Name 



pir:H72734 



Probability 
lbvle-06 



Acc# 
H72 754 



ORF Name 



NTID AAID 



■ NT . AA 

— — - Score 
Length Length • , -.- = 



Probability 



23945931__t3_55 


|491 


| 2411 


3C6 | 


1041 1 135 


1.2e 


-06 


Protein name 








Locus Name 




" ACC#' 


hypothetical protein slr1166




pir:S75877 




S75877 


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA , n , , 
, Score 
Length . 


Probability 


23954511_11_6 


492 


2412 


811 . •• 


2436 |2745 | 


|1.2e 


-285 | 


Protein name . " 








. Locus Name 




, Acc# ' 










sp:PPSA_ECOLI




P23538


Description 














(PEP SYNTHASE)












r - . , 




ORF Name , 


NTID 


AAID 


NT .,' 
' Length 


AA 

T — ; Score 
Length 


Probability' 


|23989752_cl_84 ' ' 


493 


2413 


| 166 


501 288 - 


l.Oe 


-42 


Protein name - , 








Locus Name 




: ACC# 










sp:3DHQ_NEUCR 




P05195 


Description - ■'■ ■ 














DEHYDRATASE)




ORF Name 


NTID 


AAID 


NT 
; Length 


AA "» 
„ —. y ' Score 
■Length 


Probability 


24306512j 1 c2_99 


494 


2414 , 


202 


609 509 ' 


r..0e- 


-48,, 


Protein name 








Locus Name 




• ACC# 



sp:GCH1_OSTOS



O61573



Description ' 
GTP CYCLOHYDROLASE I (GTP-CH-I)






ORF Name 



24337752 IT VI 



Protein name 



Description 



NT ID AAID 



NT . AA 
Length Length 
T75- — 1 11137 



Score Probability 



Locus Name 



sp:YDAO_ECOLI 



II: ^e- 8-8- 



. Acc#. 

P76055:Q47558



HYPOTHETICAL 35.6 KD PROTEIN IN DBPA-INTR INTERGENIC REGION



ORF Name 



24646887 11 16 



Protein name 



Description 



NO-HIT 



NT ID AAID 



NT AA 
Length , Length 

16 9 • 



■ Score Probability 



Locus Name 



Acc# 



ORF Name 



24881717 12 TV 



Protein name 
Description 
NO-HIT



NT ID AAID 



NT AA „ • ■ ' . 

— ' . — Score 
Length - Length ■■ 



TUT 



TIT 



Locus Name 



Probability 



Acc# 



ORF Name



25595262 13 68 



Protein name 



Description 



NO-HIT 



.NTID . V AAID 



4 98 



12418 



NT AA 
Length . Length 
168 ' 



— Score , Probability 



1507 



Locus Name 



Acc# 



ORF Name 



26354750 13 50 



Protein name - 
Description . 
NO-HIT



NTID. ..AAID, 



12419 



NT AA 
Length Length 
60 



Score ; Probability 



Locus Name 



Acc# 






ORF Name 



NT ID AAID 



|2933250^t3_6> 



[575TT 



NT , AA 
Length Length 
IJ01 



Score 



Probability 
B.ie-79 - 



Protein name 



Locus Name 



enoyl-(acyl-carrier protein) reductase



gp:AF104262 



Acc# 
AF104262 



Description 



Pseudomonas aeruginosa enoyl-(acyl-carrier protein) reductase (fabI) gene, complete cds


ORF Name NTID, ; AAID 


NT 
Length 


AA' 
Length 


Score Probability 


29335786jJl3_46 " 501 / 2421 


249 


750 


428 |3.9e-40 



Protein name 



Locus Name 



unknown 



gp:AF116284



ACC# 
AF116284 



Description 



Pseudomonas aeruginosa DnaJ-like protein gene, complete cds; and unknown genes.



ORF Name 



NTID ' . AAID 



2938207b tl 4. 



NT ■ - AA 
Length Length 
TTT 1 



Score 



Probability 
3,0e-40. V 



Protein name 



Locus Name 



probable membrane protein b1520



pir:C64906



Acc# ', 
C64906 



Description 



ORF Name 



NTID AAID 



3142582b tl 22 



NT AA ■ 

Length Length 
2T0 



Score 



YTTT 



Probability "' 
|1.7e-76 ~~~ 



Protein name 
Description 



Locus Name 



sp:RPE_HAEIN



Acc# 
P44756 : 



ORF Name 



NTID 



AAID 



152177 c3 153 



NT AA 
Length Length 



— i-v, V: T ^~i_>." Score .. Probability ' 



Protein name 
Description 
NO-HIT



Locus Name 



Acc#/ 






ORF Name 


NTID 


AAID 


NT^ 
Length 


AA 

.— . , Score . 
.Length 


Probability' 


3316436_tl_19 


505 


2425 




■ 1389 331. 


7.4e 


-30 


Protein name ; 










Locus Name 




Acc# 












sp:VISC_ECOLI


P25535. ■ 


Description ' 
















VISC PROTEIN, 




ORF Name 


NTID 


AAID 


NT 
Length 


AA Score 
Length 


Probability . . 


3363282U_r3_62. 




| | 2426 


1 P 49 1 


|750 | |559 | 


5.1e- 


-54 


Protein name 










Locus Name




. Acc# 


ribose-5-phosphate isomerase






gp:AF037440 




AF037440



Description 



Edwardsiella ictaluri D-3-phosphoglycerate dehydrogenase (serA) gene, partial
cds; ribose-5-phosphate isomerase (rpiA), inhibitor of chromosome initiation
(iciA), putative 26 kDa protein (yggE), putative 30.6 kDa protein (yggB), and
fructose 1,6-bisphosphate aldolase (fda) genes, complete cds; and
phosphoglycerate kinase (pgk) gene, partial cds.



ORF Name : NTID' 


AAID . 


- NT 
Length 


■' AA 

. — . ; Score 
Length 


Probability 


3386343I_t3_53 , b07 


2427 


430 


12 93; 455 


4.2e-43- 


Protein, name 






Locus Name 


. Acc# 


conserved hypothetical protein




pir:F75546 • 


F75546 


Description 










ORF Name : ' NTID 


AAID 


NT . 
Length 


AA 

— . , Score . 
Length 


Probability' * 


35163902_c2_109. | 508 


|2428 , 


j 527- 


1884 |9b0 | 


■ ja-.9e-103 | 


Protein name 






Locus Name . 


.. ' Acc# 



sp:MSBA_ECOLI



P27299 



Description

PROBABLE TRANSPORT ATP-BINDING PROTEIN MSBA



ORF Name 



NTID' AAiD 



35350061 C2 98 



"5u~5~ 



NT 
Length 

£4 " 7 



AA 
Length 
|195 



Score ' Probability 



Protein name 
Description 
NO-HIT . 



Locus Name 



Acc# 






ORF Name 


NT ID AAID . 


NT 
Length 


AA 

. — . , Score 
Length 


Probability . 


3612 8 37 8__1 3_6 7 


blU z4 JU 




375 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID AAID 


NT 
Length- 


AA 

— Score 
Length . 


Probability - 






3912568_cl_92 


| 511. 2431 


525 \ 


1578 . |1470 | 


|1. 5.e-150 


Protein name 








Locus Name 


. Acc# 


soluble pyridine nucleotide transhydrogenase




gp:AF159108 


AF159108 


Description 












Azotobacter vinelandii soluble pyridine nucleotide transhydrogenase (sth) gene, complete cds










ORF Name 


' : NTID AAID 


NT 
Length 


AA 

: — , Score 
Length 


Probability 


|4iii008__r2_33 . 


| |512 2432 


94 • 


285. |240 | 


|3,2e-20 . 


Protein name. 








Locus Name 


Acc# k ' : " 










sp:CSPA_PSEAE


P95459 


Description 












MAJOR COLD SHOCK PROTEIN CSPA


ORF,. Name ' 


NTID . AAID 


NT . 
Length; 


AA 

' — " , - .Score 
Length 


Probability ' •- 


|4500892_cl 91 ' 


| 513 | |2433 


2 91 


875 - 599 . 


2.9e-58- 


Protein name 








Locus Name 


Acc#' 



sp:YDIA_ECOLI



Description • 

HYPOTHETICAL 31.2 KD PROTEIN IN PPSA-AROH INTERGENIC REGION



P03822:P46137:P76203






ORF Name 



NT-ID 



AAID 



5122667 ri T2 



NT AA 
Length Length 
368 



Score 



E3— — I 



Probability 
6.4e-05 ■ 



Protein name 



Locus Name 



mannosyltransferase-like protein



gp:YPS251712



Acc# 
AJ251712 



Description 



Yersinia pseudotuberculosis serotype O:1b hemH gene (partial) and O-antigen
gene cluster for ddhD gene, ddhA gene, ddhB gene, ddhC gene, prt gene, wbyH
gene, wzx gene, wbyI gene, wbyJ gene, wzy gene, wbyK gene, gmd gene, fcl gene,
manC gene, wbyL gene, manB gene and wzz gene.



ORF Name 


NTID 


AAID 


NT 
. Length. 


— Score 
Length 


.Probability 


5859762_^c2_120 


- 515 


243b-., 


| lib- 


346 .. , 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
. Length 


AA 

— , Score 
Length , 


Probability 


58445ui_ci_S2 


• 516 


2456 


|55i 


1656 564 


i.5e-54 " . | 


Protein name . 








Locus Name . 


Acc# . 










sp:Y653_HAEIN


P44029


Description 












HYPOTHETICAL PROTEIN HI0653








ORF Name. 


NTID 


AAID 


. NT 
Length 


AA . " 

— , Score 
Length 


Probability : . 


byy3768_c2_122 


517 


2437 


| |b!4 


. 1542 • p05- | 


|4.6e-24 . | 



Protein name 



. Description 



Locus Name 



sp:OSTA_HAEIN



Acc# 
P44846 



ORGANIC SOLVENT TOLERANCE PROTEIN HOMOLOG PRECURSOR






ORF Name ' 


: NTID 


AAID 


, NT - AA 
Length Length 


Score 


•' ' Probability •'■ 


7150051_c3_Ul 


518 


2438 


| 70 213.. ; 


131 


1.2e-08 . • 



Protein name 



Locus Name 



hypothetical protein APE2143



pir:B72521: 



Acc# 
B72521 



Description 






ORF . Name 



NTID 



AAID 



NT 



AA 



Length , Length 



Score Probability 



977055 c3 129 


|519 2439 327 - 9U4 • 505 


2 . 7e-48 


Protein name 




Locus Name 


Acc# 






sp : YBJE_ECOLI 


P75826 


Description 








HYPOTHETICAL 34.4 KD PROTEIN IN POXB-AQPZ INTERGENIC REGION




ORF Name 


NT ■ AA 
NTID AAID . — . , — \ , Score 
Length Length 


Probability 


9882793_c3_128 


|520 2440 65 v |198 | |109 | 


|2.5e-06 



Protein name 



Locus Name. 



hypothetical protein APE0666> 



pir:F72654



Acc# 
F72654



Description . 
ORF Name 



NTID 



AAID 



NT ..AA 
- — — Score 

Length Length - — - — 



23515951 cl 12 



TIT 



12441 



TT5 - 



T7TT 



Probability, 
4.5e-14



Protein name 



Locus Name 



adhesin complex 25K protein precursor LecA protein



pir:JC5327



Description .. 



ORF : Name 



Acc# 

JC5327:PC4312



NTID 



AAID 



24-259425- c3 lb 



2442. 



NT AA .. 
Length Length 

165 



Score Probability 



raw 



2.4e-15



Protein name



Locus Name 



adhesin complex 25K protein precursor LecA protein



pir: JC5327 



Description 



ORF Name 



; Acc# 

JC5327:PC4312



NTID 



AAID 



AA 

- — , Score 
Length Length — ^— 



33986343 13 10 



NT 
n 



f2TTTu~ 



Probability 
2.9e-250



Protein name 



Locus Name 



oligopeptidepermease 



gp:SPOPPDACA



Acc# 
X89237 



Description , . 

S. pyogenes DNA for oppA, oppB, oppC, oppD, oppF, and dacA genes.






ORF Name 



NT ID AAID 



4727338 T3 9 



12444 



NT 
Length 
325 



AA 

— ; , Score Probability- 
Length — r- : " L 



T7T 



1.0e-128



Protein name 



Locus Name 



oligopeptidepermease 



gp:SPOPPDACA



Acc# • 
X89237 



Description . • 
S. pyogenes DNA for oppA, oppB, oppC, oppD, oppF, and dacA genes.



ORF Name 



NTID AAID 



152 5 



12445 



NT 
Length 



AA 
Length 
[TS2 — 



Score Probability 



Protein name 
Description 



Locus Name 



Acc# 



ORF Name 



'NTID 



AAID 



5053212 El 8 



. NT 
Length 
340 I 



AA 

v r ~ , Score Probability 
Length — ^ — — ^ 



11398 I |S.3e-143 



Protein name 



Locus Name 



oligopeptidepermease 



gp:SPOPPDACA



Acc# 
X89237- 



Description : • •• , 

S. pyogenes DNA for oppA, oppB, oppC, oppD, oppF, and dacA genes.



ORF Name ' 



NTID, AAID 



112255555 c2 101 



5TT 



[2T4"7~ 



NT • 
Length 
221 



AA 



-■ — ■ . Score " Probability 
Length — — -r— + 



Protein name . 



Description 



1655 I 1753 I ll.4e-74 

Locus Name 



sp:DP3X_HAEIN



Acc# t- 
P43746 



DNA POLYMERASE III SUBUNIT GAMMA/TAU


NT 

ORF Name NTID AAID I — 

Length 


AA 

_ — ; Score 
Length 


Probability 


1256i562_c2__10? •• 528" 2,448 214 


645 254 


3.5e-21 


Protein name 1 


-Locus Name 


Acc# 


hemolysin-related protein


pir:F72326


, F72326 • 



Description »• 



ORF Name 
|12gQ5253_ci_8b 



NT ID AAID 



NT ■' AA 
Length 'Length " 
— n 12904 



Score Probability 



1.3e-65 



Protein name - 

■j ■ . • . . 

Description , . ' 

(MUREIN HYDROLASE D) (REGULATORY PROTEIN DNIR)



Locus Name 



sp:MLTD_ECOLI



Acc# • 

P23931:P32982:P77350



: ORF Name 


NT 

NTID AAID , „ ■- ; 

Length 


AA- 

— Score 
Length . 


Probability 


12891082_r3_51 


530 2450 237 


714 .234 


1.4e-19


Protein name 






Locus Name 


Acc#. 








sp:YBHD_ECOLI 


1 P52696 :P75 


Description " 








761 


HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN MODC-BIOA INTERGENIC REGION


ORF Name 


NT 

1 NTID AAID — 

Length 


AA . o ' ■ t 
. , — . , Score 
■ Length 


Probability . 


i38760i0_ti_ll 


531. ■ 2451 135 


408 155 


7.0e-11


Protein name ' 






Locus Name 


Acc# 








sp : RBCR_CHRVI 


P25544 


Description 










RUBISCO OPERON TRANSCRIPTIONAL REGULATOR








ORF Name 


NT .. 

NTID .AAID .■- , . 

Length 


AA 

; , Score . 
Length 


•Probability ■ 


15870706_cl_68 • 


.. | 532 |2452 344 


1035 |1009|> 


1.1e-101


Protein name 






Locus Name 


. Acc# 



Description 



sp:LEU2_ECOLI



P30127:P78042



(ISOPROPYLMALATE ISOMERASE) (ALPHA-IPM ISOMERASE) (IPMI)




ORF Name 


NTID •■ AAID 


NT • , AA 
■ — , — - , . Score 
Length Length 


Probability 


175062_clJ79 v- 


v | 533 . 2453 


219. 660 . |740 | 


p.4e-73 


Protein name -. . 




Locus Name 


Acc# 



sp:HPPD_PSESP 



P80064 



Description - 
4-HYDROXYPHENYLPYRUVATE DIOXYGENASE (4HPPD) (HPD)






ORF Name 


NT' 

NT ID AAID 

■ - . Length 


"AA 

T : — ^, Score 
Length 


Probability 


|i97690S2_cI_74 


b34 . 2454 131 


396, 294 ■ 


2.0e 


-25 


Protein name 






Locus Name 




Acc# 








sp:SYK_ACICA ' 


Q43990


Description 












LYSYL-TRNA SYNTHETASE (LYSINE--TRNA LIGASE) (LYSRS)




' 1 


ORF Name 


NT 

;NTID AAID — - , 
Length 


AA 

, Score 
Length 


Probability 


20178438_ci_80 


| 535 .(2455 173 


522 629 


1 . 9e 


-61 


Protein name. 






Locus Name 




Acc# 




















sp:HPPD_PSESP 


P80064 


Description 1 '. , 












4-HYDROXYPHENYLPYRUVATE DIOXYGENASE (4HPPD) (HPD)






ORF - Name . 


. .'. • • NT 
. NTID AAID — : , 
Length 


AA, ' 

, Score 
Length 


-Probability 


2I-7295i3_c3_i2? 


535 2456 61 


186 •■; 






Protein name 






Locus Name \ '' 




- Acc# 


Description 












NO-HIT


ORF Name- . . 


... NT : 
■ NTID AAID : . . — ■ " 
■v; ■ Length 


AA , 
T ~, i Score 
Length 


1 Probability 


21738306_12_30,. ; 


- ; 537; - 2457 . 100 


303 : 185. 


|4. 9e- 


-14' . 


Protein name . 






Locus Name 




Acc# ' 








sp:SECF_HAEIN 


P44590 . 


Description 












PROTEIN-EXPORT MEMBRANE PROTEIN SECF










ORF Name 


NT 

. ... NTID AAID — , . 

Length 


aA 

, - — , Score 
Length. 


Probability 


22443750_c3_I28 


' 538 |2458 / 201 • 


606.. 166 


8.1e- 


-12 


Protein: name 






Locus Name 




Acc# . 








sp:YC54_SYNY3




P74078 



Description 
HYPOTHETICAL 38.3 KD PROTEIN SLL1254 






ORF Name 



NTID AAID 



23572128 cl 92 



NT AA 
Length Length 
103 = 



TTT 



Score Probability
|179 | |6.0e-13 ~ 



Protein name 



Locus Name 



sp:RADA_PSEAE 



Acc# • 
P96963 



Description '" .* ' 

DNA REPAIR PROTEIN RADA HOMOLOG (DNA REPAIR PROTEIN SMS HOMOLOG)



ORF Name 



23614376 c3 >119 



NTID 
1540 



AAID 



NT AA 

— , — , Score Probability 

Length Length - — ~~ — , — : — — — ; — — jL 



ITT 



Protein name 



Description 



|939 | |742 | |2.1eT73 — r 

Locus Name Acc# 



sp:EX3_HAEIN



P44318 



EXODEOXYRIBONUCLEASE III (EXONUCLEASE III) (EXO III)


ORF Name •'; NTID AAID 


NT 

Length 


AA 

— ~ , Score 
Length 


Probability 


23994182_tl__17. . lf 541" , • 2451 


| 171 


516 175 


2.5e-13 


Protein name 






Locus Name 


Acc# - 


orf1




gp:PAU39558 , 


U39558


Description ' - 








Pseudomonas aeruginosa orf1, TolQ (tolQ), TolR (tolR), TolA (tolA), and TolB (tolB) genes, complete cds.










ORF Name NTID AAID 


NT 
Length, 


AA 

_ — . Score 
Length 


Probability 


24276625- c3 122 " 542 2462 


293 


882 337 


1.7e-30


Protein name 






■Locus Name 


Acc# 








sp:YGIP_ECOLI 


P45463 


Description- 










HYPOTHETICAL TRANSCRIPTIONAL REGULATOR IN BACA-TTDA INTERGENIC REGION


... ORF Name 1 "NTID AAID 


NT :'. 
Lengthy 


AA ' ■ . ,. 
. — . , , Score 
Length' 


Probability 


24406575_cli69 | |543 2463 


2UV | 


|684 | |780 | 


|1.9e-77 


Protein name 






Locus Name , 


Acc#. 








sp:LEUD_AZOVI


P96196 



Description 

(ISOPROPYLMALATE ISOMERASE) (ALPHA-IPM ISOMERASE)






ORF Name . 


NTID 


.AAID 


NT 
Length 


AA 

• — Score 
Length 


Probability 


244i591i_c!J_li5 ' 


544 


2454 


97 


294 |97 | 


|4.be-0b. 
« 


Protein name 










Locus Name 


Acc# 


outer membrane protein H.8 precursor






pir :S04157 


S04157


Description




' ./ 










ORF Name 


NTID 


AAID 


NT 


AA- 

; Score 
Length : 


Probability 


24417077_c3J.21 


| 545 


2465 




1558' 229 1 




Protein. name 










Locus Name 


Acc# 












sp:DP3X_HAEIN


P43746 


Description 














DNA POLYMERASE III SUBUNIT GAMMA/TAU








ORF Name 


NTID 


AAID . 


NT 
Length 


AA o ... V' 
— , Score 
Length. 


Probability 


25584b0i_c2_110 " 


545 


2455 \ 


230: 


693 ■ 




Protein name 










Locus Name 


Acc#, 


Description 














NO-HIT


* *. ■ . 
ORF Name 


NTID ' 


AAID' 


NT 
Length 


AA *■ 
_ — , . • Score 
Length 


Probability 


301$8405_c2_100 ; 


J 547 


| 2457 . 


417 . 


1254 |. 1913 


1.7e-197



Protein name. 



Locus Name 



sp:SYK_ACICA 



Acc# 
Q43990



Description . . 
LYSYL-TRNA SYNTHETASE (LYSINE--TRNA LIGASE) (LYSRS)



ORF Name 



NTID' 



AAID 



134405268 cl 70 



NT ; , 
Length 
,1169 ■■' 



AA 

T Score 
Length - - - 



Protein name 
Description 

NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



NTID AAID 



3912568 c2 105 



NT AA 
Length Length 
T51 1 11404 



Score 



Probability 
1.4e-81' 



Protein name 



Locus Name 



sp:NHAC_BACFI



Acc# . 
P27611



Description 
NA(+)/H(+) ANTIPORTER



ORF Name 


NTID AAID 


. NT 
■Length 


AA 

, — . . Score , 
. Length 


Probability 


39b33-V7_c3JL17 


| 550 | |2470 


218' 


557 , ■ 534 . 


3.1e-51 


Protein name 








Locus Name 


Acc# 










sp:LEU2_CANMA


Q00464 ' 


Description 












ISOMERASE) (ALPHA-IPM ISOMERASE) (IPMI)








r ORF Name. 


NTID AAID 


• NT .. 
Length 


AA 
Length . 


Probability 


3988813_rl_4 


| 551 |2471 


: 628. 


1887 \ 1391 


3.5e-142



Protein name ; 



Locus Name 



general protein secretion pathway subunit SecD



|gp:AF179925-, 



Acc# , 
AF179925 



Description 



Citrobacter freundii general protein secretion pathway subunit SecD gene, complete cds



ORF Name 



NTID AAID 



.. NT . AA 

,- — , ' , — ■ Score 
Length . Length — — — 



Probability 



4314068 c3. 118 



12472 



11080 



|1.8e-138 



Protein name \' . 

Description ,_ ; , 

(IMDH) (3-IPM-DH) (FRAGMENT)



Locus. Name 



sp:LEU3_NEILA



Acc# 
P50180 



ORF Name 



NTID AAID 



4335328 c2 98 



2473 



NT AA 
Length Length 
[S3 — ~ 



, Score Probability 



Protein name 
Description 



Locus i, Name 



Acc# 






ORF Name NTID AAID 


NT 
Length 


" AA 

. — , Score 
Length 


Probability 


4487638_iI_I , ; • ■■ | 554 |2474 


£13 


1842 |1885| 


|1.2e-l?4 


Protein name 






Locus Name 


Acc# 








sp : PPCK_CHLLI 


Q08262 


Description ' . ; - 










(PHOSPHOENOLPYRUVATE CARBOXYLASE) (PEPCK)


ORF Name ■ NTID AAID 


NT 
Length .: 


AA 
Length 


Probability 


4771925_c2_94 | 55b |2475 


207 


524, , 331 


|7 : 4e-30 ,. 


Protein name . 






Locus Name 


Acc# 








sp:RUVA_PSEAE


Q51425 


Description 










HOLLIDAY JUNCTION DNA HELICASE RUVA


ORF Name NTID AAID ' 


NT 

Length 


AA ' ■•■ 
— , Score 
Length 


Probability 


4865427_r3_6i . 556 . 2476 


289 


870 . |247 | 


|b.9e-21 . 


Protein name ' 






Locus Name 


. Acc# ■ • 


hypothetical protein • ' . 


pir:S75235


-. . • S75235 , 


Description 










ORF Name ': NTID AAID . 


NT 
Length 


. AA 
Length 


Probability 


4881338_ti_5 ~ 557 ■ |2477 ■ 


287. 


|864 , , | |420 | 


|2.7e-39 


Protein name '•' , • 






Locus Name 


Acc# ' 








sp:SECF_HAEIN


P44590 


Description 










PROTEIN-EXPORT MEMBRANE PROTEIN SECF


ORF Name NTID AAID 


NT 
Length , 


AA - ... 
, — . , Score 
Length • 


Probability 


5084463_t2_28 . 558 • 2478 


114 


345 [240 | 


|3.2e-20 



Protein name 



Locus Name 



sp:YAJC_ECOLI 



Acc# . 
P19677 



. Description : , . - _ 

HYPOTHETICAL 11.9 KD PROTEIN IN TGT-SECD INTERGENIC REGION (ORF12)






ORF Name 



5111588 c3 116 



NTID 
Ib59 



AAID 



NT AA 
Length Length 
348 | [TuT7 



Score 



Probability 



Protein name 



I 1455 1 p-^e-150 ~ 
Locus Name Acc# 



fructose-1,6-bisphosphate aldolase



|gp:PST011927 



AJ011927



Description

Pseudomonas stutzeri fda gene and gene encoding hypothetical protein.



ORF Name 



1978400 cl 83 



NTID 



AAID 



][ 



NT ■ AA 
Length Length 
|387 



- — •• , .'' Score 



[577T 



Probability 
|4 . 4e-48 



Protein name 



Locus Name 



penicillin-binding protein 4



|gp:AF156692 



Acc# 
AF156692



Description 



Neisseria gonorrhoeae penicillin-binding protein 4 (pbp4) gene, complete cds.



'"' "', NT AA 
ORF Name NTID AAID . -h- - . — ' , 

Length Length 


Score 




Probability 


1053753_t3j53 . bSl . 2481. 588 1767 : 






1.2e-79 



Protein name 



Locus Name 



putative membrane protein 



gp:AF150928



Acc# 
AF150928



Description 



Acinetobacter sp. ADP1 BenP (benP) and AreR (areR) genes, complete cds; are
operon, complete sequence; SalD (salD) and SalE (salE) genes, complete cds;
SalR (salR), SalA (salA), putative membrane protein, putative 2-component
regulatory protein, putative histidine kinase of 2-component regulatory
system, and carbonic anhydrase homolog genes, complete cds; and



ORF Name^ 


NTID AAID 


NT 
Length 


AA 

• — " . Score ... 
Length . 


Probability 


1058425_c2_108 


b62 | 2482 


80 


243 |310 | 


|1..2e-27 


Protein name 






Locus Name 


Ace# ' .. 


ribosomal protein S18




pir:E64076 


E64076 



Description






ORF Name 



NT ID 



AAID 



NT 



AA 



Length Length 



Score , Probability 



±203450Jll_5 563 2483 


530' 


1593 




1297 | 


13 .2e 


-132 


Protein name 


' ■ i ' 




Locus Name 




Acc# 








sp:YB2X_HAEIN 1 , 


O86233


Description 


j, 














HYPOTHETICAL PROTEIN HI1126.1


ORF Name NT ID AAID 


NT 
Length . 


AA 
Length 


Score 


Probability 


12271926_cl_71 | 564 , | 2484 


| 143 


432 




197 | 


1.2e 


-15 


Protein name 






Locus Name 




Acc# 








sp:YFFB_HAEIN




P44515 


Description • 
















HYPOTHETICAL PROTEIN HI0103 


ORF Name NTID AAID 


NT AA 
Length ' Length 


Score 


Probability 


15635930_t3_6i . . 565 2485 


373 


v 1 


122 




842 


5.2e 


-84 


Protein name 






Locus Name 




. Acc# 








sp:QUEA_ECOLI


. P21516 i, 


Description 
















(QUEUOSINE BIOSYNTHESIS PROTEIN QUEA) • , 


ORF Name NTID, ■ AAID 


■• , NT 1 
: Length 


AA 
Length 


Score 


Probability. 


16i34657_t2_24 , 56 : 6 , :. 2486 


1 l J6V 


1104. 






1.3e 


-89 


Protein name 






Locus Name 




Acc# ... 








sp:GCST_ECOLI


P27248 . 


Description : "\ ;\ 
















PROTEIN)


ORF Name . , NTID AAID 


NT 
Length 


AA ■ ; 
Length 


Score 


Probability 


1972ii_ci_9i 567. 2487 


240 


7 


23 




304 


5.4e. 


-27 


Protein name ' r 






"Locus Name 




Acc# 


hypothetical protein ■ : .. 




gp:ACRBDOXM




Z46863



Description 



Acinetobacter sp. cysD, cobQ, sodM, lysS, rubA, rubB, estB, oxyR, ppk, mtgA,
ORF2 and ORF3 genes.






ORF Name: 20890660_t3_56
NT ID: 568
AA ID: 2488
NT Length: 121
AA Length: 366
Score:
Probability:
Protein name:
Locus Name:
Acc#:
Description: NO-HIT


ORF Name 


NTID 


AAID 


.NT 
Length 


AA 
Length 


Probability 


203i5682_li_i ; 


[559 


2483 


142 


423 ' 162 


6.0e-12 


Protein name 








Locus Name 


Acc# 



' Description 



sp:YIBN_ECOLI



P37688



HYPOTHETICAL 15.6 KD PROTEIN IN SECB-TDH INTERGENIC REGION




ORF Name 


. NTID AAID 


NT 
Length . 


. AA 
— . , Score 
Length 


Probability 


2I7S9fi56_tt.fl 


570 2430 


531 


|i776 | |630 | 


|l.<3e-i03 


Protein name 






Locus Name 


; Acc# 


Na+ solute symporter (Ssf family)




pir:E70480


E70480


Description 










ORF Name 


NTID AAID 


. NT 
Length 


AA 

, . Score 
Length 


- Probability 


|2235338u_H_7 


571 2491. 


83 


'. 270 . 





Protein name 



Description t 



.Locus Name 



Acc# 



NO-HIT


. ORF Name ' 


NTID . ' AAID • . 


NT • 
Length 


AA 

— Score 
Length 


Probability 


234!4U426_t2_2b 


| 572 2432 


351 


258S |2S73 | 


p.le-299 ; 


Protein name 




■ 


" Locus Name 


Acc# 



sp:GCSP_ECOLI



P33195



Description
DECARBOXYLASE) (GLYCINE CLEAVAGE SYSTEM P-PROTEIN)






ORF Name 



23444bUi 12 2b 



Protein name. 



NTID 



AAID 



NT 
Length 
138' 



AA o 
t Score 
Length 1 — - — — 



glycine cleavage system protein H: aminomethyl carrier protein: glycine
decarboxylase complex protein H



Description 



Locus Name 



pir:A56623



Probability 
2.be-38 



Acc# 

A56623 :S3 6 
833 :B56689 
•:I41231 :H6 



ORF Name 


NTID 


AAID 


NT 
Length 


' AA 
Length 


Score 


Probability 


24391340_r3_47 


p4 


2494 


82 • 


249 - 








Protein name 








Locus Name 




Acc# 


Description 
















NO-HIT




ORF Name , 


■, NTID 


AAID , . 


NT 
Length 


AA 
Length 


: Score 


Probability 


24397200_r2_32 


575 


2495 


410 


1233 


p30 


|1.0e 


-135 


Protein name 








Locus Name 




Acc# 










sp:TGT_HAEIN




P44594 


Description 
















TRANSGLYCOSYLASE) (GUANINE INSERTION ENZYME)












ORF Name 


NTID 


AAID . 


NT 
Length 


AA 
Length 


■Score 


Probability 














^>54002^_c3 117. . 




2496 . 


399 


.: ; 1200 ' 


. 725. 


l t 3e 


-71 . 


Protein name 




.1. 




. Locus Name 




. Acc# 










sp:YCAB_PSEFR


P72190 


Description 












" . - r 

* I'- 




HYPOTHETICAL 30.2 KD PROTEIN IN CAPB 3' REGION










ORF Name < 


NTID' 


AAID 


NT ' AA 
Length Length 


Score . 


Probability 


25562762_12_18 


|577 


2497 


109 


330 . 


199 


7.2e-16


Protein name 






7 . .• 


Locus Name 




Acc# • 


glutaredoxin 3 (grxC1) RP204


pir:F71731


F71731



Description 






ORF Name 



• NTID ' 



AAID 



256264b2 c> VAb 



1T7TT 



NT 

in 



AA „ • 

— Score 
Length Length — — — - 



321 




129 





Probability 
1.9e-08



Protein name 



Description 



Locus Name



sp:YCGL_ECOLI



Acc#
P76003



HYPOTHETICAL 12.4 KD PROTEIN IN MINC-SHEA INTERGENIC REGION


ORF Name NTID 


AAID : 


NT 
Length 


AA 

— , Score 
Length 


Probability 


282BB0_c^_110 579 


|2499 


316 


951 127 


2.3e-05


Protein name 






Locus Name 


Acc# 


hypothetical protein 






gp:SFR236923


AJ236923


Description


Shewanella frigidimarina ifcA gene and ORF2 (partial) and ORF1.


. ORF Name 1 NTID 


AAID' 


NT ' 
Length 


AA • 11 
- — — . Score 
Length 


Probability 


29314057_cl_90 y 580 


2500 


289 


870 |705 | 


± . 76-6 9" ' 


Protein name 






Locus .Name - 


' ACC# 


probable ion transporter 


pir:E75470


E75470


Description 










ORF Name NTID 


'" ' AAID 


NT 
Length 


AA 0 
- — , Score 
Length 


Probability ■'; ' 


29^ii4b9_r2_3'9 581 


. |2501 • 


143 | 


432 196


2.7e-14


Protein name 1 






Locus Name 


Acc# .. 








sp:SYL_SYNY3


P73274


Description 










LEUCYL-TRNA SYNTHETASE (LEUCINE--TRNA LIGASE) (LEURS)




ORF Name ' NTID 


AAID 


NT' 
Length 


AA 

— , ■ Score-. 
Length 


Probability 


3O10O432_t3_« 582. 


2502 


361 | 


1086 . 128 


2.3e-05


Protein name ., 






Locus Name 


. \ . Acc# 








sp:HOLA_ECOLI


P28630


Description 










DNA POLYMERASE III, DELTA SUBUNIT






ORF Name 


NTID 


AAID 


NT

Length 


AA . ' - 
„ — . , Score 
Length 


Probability 


32982b7_tl_16 


583 


2503 


|178 


537 






Protein name 










Locus Name 


■j 


Acc# 


Description 
















NO-HIT




ORF Name 


. NTID 


AAID 


. NT 
Length 


AA 

„ — . ; Score 
Length . 


Probability 




584- 


2504 


155 


458 ' 146 


' 3.0e-10 , 


Protein name 










Locus' Name 




. Acc# ... 


unknown ■ 




gp:AF064527 




AF064527


Description ", 


< 














Rhodocista centenaria PPH (pph) gene, complete cds; and unknown genes.




, ORF Name ' 


NT ID 


AAID 


NT . v . 
Length 


AA 

- — , Score 
Length 


Probability 

r " ■ * • 


3S07781_ ; c3_i2i5 


585 


2505 


170 | 


■513 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT




ORF Name: 3925443_c2_107
NT ID: 586
AA ID: 2506
NT Length: 136
AA Length: 411
Score: 369
Probability: 6.9e-34
Protein name:
Locus Name: sp:RS6_ECOLI
Acc#: P02358
Description: 30S RIBOSOMAL PROTEIN S6
















ORF Name . 


NTID ' 


' AAID . 


; ' NT 
Length 


AA . _ 
— Score 
Length 


Probability : 


4003558_i:r_,2 • 




2507 


|lb,7. | 


474 1508 1 


1.3e 


-48 


Protein name 










Locus Name 




■ Acc# 












sp:DUT_ECOLI



P06968



Description



(DUTPASE) (DUTP PYROPHOSPHATASE)






ORF Name 



NTID 



AAID 



NT '' '. AA ■ 
Length Length 
151 " 



Score 



Probability 
|2.0e-36 



Protein name 

Description
PROTEIN-EXPORT PROTEIN SECB



Locus Name



sp:SECB_ECOLI



Acc#
P15040



ORF Name 


NTID 


AAID 


NT 
Length i 


AA 

y — t , . Score . 
Length 


Probability 


4860762_tj_S4 


589 . 


|2509 


268 ;| 


|B0-7 | 




Protein ■ name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name •• • 


. -i NTID 


AAID 


NT 
Length 


AA 

- — , Score 
Length : 


Probability 


|48<d0943_c1_89 




|2510 


185 | 


558 199 


|7.2e-16 



Protein name 



Locus Name 



NADPH:quinone oxidoreductase



gp:AF145234



Acc#
AF145234



Description

Arabidopsis thaliana NADPH:quinone oxidoreductase (NQR) mRNA, complete cds.



ORF Name 



NTID AATD 



14897050 13 44 



^5TT 



NT AA 
Length Length 
153 



Score , Probability . 
|355 | |2 . le-32 



Protein name 



Locus Name 



acetylglutamate kinase



pir:D70477



Acc#
D70477



Description 
ORF Name 



NTID AAID 



50160 cl 7b 



NT AA 
Length Length - 
TT7 



Score Probability 
|1.9e-42 - 



Protein name



Locus Name 



haemoglobin-haptoglobin binding protein HhuA



gp:HIU43198



Acc#
U43198



Description



Haemophilus influenzae haemoglobin-haptoglobin binding protein HhuA (hhuA)
gene, complete cds.






ORF Name 


NT ID 


AAID 


NT 
■ Length 


AA 

t — Score 
Length 


Probability 


g A *7 i ft — FT" A'fi — " 

bODU / lo J- Z ft U 


rm 




ft? 




249 




Protein name 










Locus Name 


Acc# 


Description 














NO-HIT


ORF Name: 6822152_c3_128
NT ID: 594
AA ID: 2514
NT Length: 160
AA Length: 483
Score: 438
Probability: 3.4e-41
Protein name:
Locus Name: sp:RL9_ECOLI
Acc#: P02418
Description: 50S RIBOSOMAL PROTEIN L9










ORF Name 


NTID " 


AAID 


NT 
Length ' 


AA 

— score 
Length 


Probability 


783426^ti^i5 


595 


■ 2515 


766 - 


. 2301 |2022 | 


4.8e-209


Protein name 








Locus Name 


. Acc# 



sp:SYL_ECOLI



Description
LEUCYL-TRNA SYNTHETASE (LEUCINE--TRNA LIGASE) (LEURS)



P07813:P78292:P77110



ORF Name 


. NTID 


■ AAID 


NT 
Length 


AA ' 
„- — , Score 
Length 


Probability 


860300_t2_2? 




2516: 


85" 


. 258 , 




Protein name 








• Locus Name 


Acc# . 


Description ' 












NO-HIT


'• ORF *. Name 


ntid; ' 


AAID 


NT 

. Length' 


AA 

, — , , Score 
Length 


Probability 


9803127_cl_88 


: 1 597 


2517 


321 


955 : |53 | 


0.041



Protein^ name 



• Locus Name 



hypothetical protein (bpi 3' region)



pir:C37397 



Acc# 
C37397 



Description 






ORF Name 



NT ID 



AAID 



9898513 c2 101 



NT . AA 
Length Length , 
527 | 1 1584 



Score Probability 



jfeOO | |l : 3e-88 



Protein name 



Locus Name 



sp:YE57_HAEiN 



Description ' " " 

PROBABLE TONB-DEPENDENT RECEPTOR HI1467 PRECURSOR



Acc# 

Q57408:P96344



ORF Name 



NTID 



AAID 



1040887 cl 72 



NT 
n 
TOT 



• AA 

T 4_V, T _ 4.U " SCQre 

Length Length 



Probability 
|1.8e-58 ~ 



Protein name 



Description 



Locus Name 



gp:AB025342 



Acc# 
AB025342 



Moritella marina genes, complete cds, similar to eicosapentaenoic acid
synthesis gene cluster.


ORF Name 


NTID 


. AAID 


NT 
Length 


' AA 

• ■ — ; , Score 

Length • » ■. • 


Probability 


10S48402^t3_46 


| 500 


2520 


395 | 


1188. 953 


9.0e 


-95 


Protein name 










Locus .Name 




Acc# 












sp:AROF_ECOLI




P00888 


Description 
















SYNTHETASE) (3-DEOXY-D-ARABINO-HEPTULOSONATE 7-PHOSPHATE SYNTHASE)




ORF Name . 


NTID 


' AAID 


NT 
Length 


AA 

. — , , Score 
Length ... 


Probability. 


10723543_13_48 


601 


! |2521 


73 


222 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT


ORF Name 


NTID 


r AAID 


NT "■" 
Length 


= AA 

— , Score 
Length 


Probability 


10969087_13_S0 


| 602 


. |2522 


1242 


3729 1 2415 


3.1e 




Protein name 










' Locus Name 




Acc# 


DNA polymerase III








gp:AF062919 


' AF062919 



Description

Pseudomonas fluorescens DNA polymerase III (dnaE) gene, complete cds.






ORF Name 



NT ID AAID 



119552252 c3 122 



fot - 



NT 
Length 
111 



AA 
Length 



Score 



2T5~ 



Probability 
5.3e-20 " 



Protein name 



Description 



Locus Name . 



gp:D90863



Acc#

D90863:AB001340



E.coli genomic DNA, Kohara clone #407(52.4-52.8 min.).




ORF Name: 20180387_c2_104
NT ID: 604
AA ID: 2524
NT Length: 208
AA Length: 627
Score: 117
Probability: 1.0e-05
Protein name:
Locus Name: sp:Y355_HAEIN
Acc#: P43988
Description: HYPOTHETICAL PROTEIN HI0355 PRECURSOR


ORF Name 


NTID AAID 


NT 
Length- 


AA 

_ — • ■ Score. 
Length ... 


Probability 


|20355003_i:2_25 


605 | 2525 


173^ , 


522 




Protein, name ' 






Locus Name 


' . . Acc# 


Description 










NO-HIT


t 'ORF Name 


NTID AAID 


NT ! : 

Length 


AA 

■ Score 
Length 


Probability. 


2048550i_c3_ii5 


505 . | 2525; 


154 , 


455 ;. 239 • 


4.1e-20


Protein name 






Locus Name 


. Acc# 


hypothetical protein PH0336 




pir:E71140


E71140


Description 


. i 








ORF Name 


NTID AAID ; 


NT 
' Length 


AA 

„ — . , Score . 
Length 


Probability 


|2120263_c3_114 j 


507 2527 


' 200 


603 217 


8y8e-18 • | 



Protein name 
Description ■ 

r (P28S) ' 



Locus Name, 



sp:YGGB_ECOLI 



. Acc# 
P11666 






ORF Name 



NTID 



AAID 



2213192b ci' 124 



"6W 



NT AA 
Length • Length 
TFT "I '11154 



Score Probability 
[1175 | p .ie-119 ~ 



Protein name . 



Locus Name 



AarC 



gp773TJ5T3TT 



Acc# 
U67933 



Description 

Providencia stuartii AarC (aarC) gene, complete cds.



ORF , Name 
[24219200; 13 59 



NTID 



AAID 



NT 
Length 
413 



AA 
Length 
11242 



Score 



|872 | 



Protein name 

Description . ' 

HYPOTHETICAL PROTEIN HI0396



Locus Name 



Probability 
|3.be-B7 ~ 
Acc# 



sp:YCFD_HAEIN 



P44683 



ORF Name



NTID AAID 



24412575 12. .27 



12530 



NT 
Length 
2^3 ■ 



AA 
Length 
1762 



Score 



T5T 



Probability 
1.3e-25 



Protein ( name 



Locus Name ' 



sp:RNH2_VIBCH



Acc#
P52021



Description

RIBONUCLEASE HII (RNASE HII) (FRAGMENT)



ORF Name' ■'!■ 


NTID 


AAID 


NT 
Length 


AA score • 
Length 


Probability 


2442325u_i3_53 


| 611 


(2531. 


, 67 


204 ' 




Protein name 








Locus Name 


' ' Acc# : * • 


Description' ' 












NO-HIT


■'-■>..' v . 

ORF Name . 


NTID 


AAID 


, NT 
. Length 


AA 

.,' — . , Score 
Length 


Probability. 


2bS73U02_tl_4 


612 


| 2532 


446 


1341 456 


4.2e-43 ; : 



Protein name 



Locus Name 



lipid-A-disaccharide synthase



pir:E64180



Acc# 
E64180 



Description 






ORF Name 



NTID AAID 



29407800-cl 80" 



FIT" 



NT AA 
Length Length 
403 * I 11212 



Score Probability 



5.4e-114



Protein name 



Locus Name 



Acc# 



sp:YFGB_PSEAE



Description



Q51385:Q51525


HYPOTHETICAL 41.7 KD PROTEIN IN PILF-NDK INTERGENIC REGION (ORF1)


ORF Name . 


NT 

NTID AAID V. — 

Length 


AA 
Length . 


Score 


Probability 


3145438_cij39 


614 2534 485


1458 


644 


1.1e-66


Protein name 






Locus Name 


Acc# 


unknown .,' 


gp:AF003741


AF003741


Description


Escherichia coli CFT073 pathogenicity island gene, complete cds.


ORF Name 


NT 

NTID AAID- — ' 
Length 


' AA 
Length 


Score. 


Probability 


33882816_cl_79 *\. 


-;• | 615- 2535 273 . 


|822 ■ 


243 


1.2e-41 


Protein name '■. 






Locus Name 1 


: A'cc#. 



Description 



sp:YFCB_ECOLI



P39199:P78252:P76939



(EC 2.1.1.72)














. ORF Name 




NTID 


AAID J 


NT ' 
Length 


AA 
Length 


Score- 


Probability 


3906293_il_ 


3 


|616 


2536 


110 ; 


333' 


146: 


3.0e-10



Protein .name 

Description
HYPOTHETICAL 21.5 KD PROTEIN IN OGT-DBPA INTERGENIC REGION



Locus Name



sp:YDAL_ECOLI



Acc#
P76053



ORF Name 



NTID AAID 



3907568 C2 105 



2537 



i NT AA 
Length . Length 
395 I 11188 



Score Probability 
p 56 | |l.le-35 ~ 



Protein name 

Description
HYPOTHETICAL 41.9 KD PROTEIN IN XSEA-HISS INTERGENIC REGION



Locus Name



sp:YFGL_ECOLI



Acc#
P77774






ORF Name 



Protein name 



Description 



NTID AAID 



NT AA 

t ~"^u t — ^ Score 
Length Length 



WW 



11077 



Locus Name 



sp:SYH_ECOLI



Probability
6.6e-109

Acc#
P04804



(HISRS)



ORF Name: 3945293_c3_113
NT ID: 619
AA ID: 2539
NT Length: 328
AA Length: 987
Score: 696
Probability: 1.5e-68
Protein name:
Locus Name: sp:SOHB_HAEIN
Acc#: P45315
Description: POSSIBLE PROTEASE SOHB


ORF Name: 3946892_c3_126
NT ID: 620
AA ID: 2540
NT Length: 280
AA Length: 843
Score: 164
Probability: 6.3e-12
Protein name:
Locus Name: sp:Y370_HAEIN
Acc#: P43989
Description: HYPOTHETICAL PROTEIN HI0370


ORF Name f NTID 


AAID 


, i NT 
Length 


AA o 
— Score 
Length 


Probability, 


3945917 ±3 64, - 621 


2541 


205 =• 


618/;' 


508


1.3e-48


Protein name - " 








Locus Name 




Acc# 








i 


sp:3MGA_HAEIN




P44321 


Description 1 
















GLYCOSIDASE) (TAG) 


ORF Name NTID 


AAID . 


•NT 
Length 


AA • 

: — ■ ■ Score 
Length. 


Probability 


4148383_clji3 622 




2542 ' 


■321 


966 


557 


8.3e-54


Protein name 








Locus Name ?. 




Acc# . 


hypothetical protein HP0852


pir:D64626




D64626



Description 






ORF Name 



NTID 



' AAJD 



4181557 cl 88 



• NT . AA 

Length Length 
473- | 11422 



Score 



952 



Probability ' 
2.2e-144



Protein name 



Locus Name 



Acc# 
P77254 



Description 

HYPOTHETICAL GTP-BINDING PROTEIN IN XSEA-HISS INTERGENIC REGION



ORF Name . 



NTID 



AAID 



14460938 ±2 37 



NT • 
n 



AA - 
Length . Length - — : 



Probability 
3.8e-55



iProtein name 



Locus Name 



O-acetylserine synthase 



gp:AF010139



Acc#
AF010139



Description



Azotobacter vinelandii iron-sulfur cluster assembly gene cluster, suhB,
cysE2, iscS, iscU, iscA, hscB, hscA and fdx genes, complete cds; ndk gene,
partial cds.



ORF Name 
5114700 cl 85 



NTID 
525 



AAID 



2545 



• NT .AA 
Length Length 
77 . ' "l 



Score r Probability 



Protein . name 



Description 



Locus , Name 



Acc# 



NO-HIT


ORF Name: 52138_c1_91
NT ID: 626
AA ID: 2546
NT Length: 109
AA Length: 330
Score: 199
Probability: 1.2e-15
Protein name: solanesyl diphosphate synthase
Locus Name: gp:AB001997
Acc#: AB001997
Description: Rhodobacter capsulatus DNA for solanesyl diphosphate synthase, complete cds.


ORF Name . NTID AAID 


NT AA : 
• — ~ : — r : Score 
Length Length . 


Probability 


6140580_i2_3S ■ 527 , 2547 - 


300 : | |903 | |187 | 


|1.9e-29 ■;■ : 


Protein .name 


. i- 


Locus Name 


Acc# 


hypothetical protein b2532


pir:C65030


C65030



Description 






ORF Name: 648425_c1_90
NT ID: 628
AA ID: 2548
NT Length: 87
AA Length: 264
Score:
Probability:
Protein name:
Locus Name:
Acc#:
Description: NO-HIT








ORF Name 


NTID 


AAID 


NT 


AA 
Length 


. Probability 


10378_c2_184.. 




2549 


... 85 


258 251 


2.2e-21 


Protein, name , . 










Locus '^Name 


Acc# 


cold shock protein


gp:VCCSPA


Y11908 


Description 


V.cholerae cspA gene. 


ORF Name: 1063510_c2_198
NT ID: 630
AA ID: 2550
NT Length: 175
AA Length: 528
Score: 208
Probability: 1.5e-15
Protein name: uridylyl transferase
Locus Name: gp:AB024601
Acc#: AB024601
Description: Pseudomonas aeruginosa dapD gene for tetrahydrodipicolinate N-succinyletransferase, complete cds, strain PAO1.


- ORF Name 


NTID 


., AAID 


NT 
Length 


AA 

- ~*\ , . Score 
Length 


Probability . 


1175012_ci_174 - 


•631' 1 


2551 


[411 


1236 929 




Protein name 










Locus Name 


Acc# 


acetate kinase ■ 


pir:B75254


. B75254 


Description ' '. ,. ' • 














ORF Name 


NTID 


AAID 


NT 

Length 


AA 

T — ^ Score 
Length 


, Probability 


il988812_cl_171 | 


532 


| (2552 


J 161 


486 ■ •■ : 




Protein name 










Locus Name 


Acc# 


Description . 















NO-HIT



ORF Name . : 


NTID 


AAID 


NT 
• Length 


AA 

, — . , Score 
Length ■■ . 


Probability 


12532552_c!i_232 


63i • 


2553 - 


|299 


900 




Protein name 








Locus Name 


" Acc# 


Description 












NO-HIT


ORF- Name 


NTID 


: AAID , 


NT 
Length 


AA 

_ — \ , Score 
Length 


Probability 


12542005J:3_i30 




25.54- 


52 


279 - 




Protein name 








Locus :Name 


Acc# 


Description ' 












NO-HIT


ORF Name 


NTID 


AAID 


■.' NT 
Length' 


AA . 
T — . ,.. Score 
Length 


Probability 1 


1300i052_ti_50 • 




| 25i>5 , 


62 | 


I 189 1 : .. : 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 1 


- NTID; 


AAID 


NT 
Length 


AA ' 
. ' — , Score 
•Length 


Probability 


|13852337_c2_202- 


. 636 


p 5 b 6 


351 | 


1055 | |515 ■ | 


|2.Te-49* 



Protein name 



Description 



Locus Name 



sp:APBE_HAEIN



Acc# 
P44550 



THIAMINE BIOSYNTHESIS LIPOPROTEIN APBE PRECURSOR 


ORF Name: 14225300_c3_224
NT ID: 637
AA ID: 2557
NT Length: 123
AA Length: 372
Score: 440
Probability: 2.1e-41
Protein name: PII-protein
Locus Name: gp:AVU91902
Acc#: U91902
Description: Azotobacter vinelandii PII-protein (glnB) and methylammonium transport protein (amtB) genes, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

, — , , Score 
Length . 


Probability 


15628127_t3_98 


638 




2558- ' 


7.2 


219 




Protein name 










Locus Name 


• Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 
T ~". t Score 
Length 


Probability 




639 






152 


489 [270. | 


|2.0e-2b 



Protein name 



Description 



Locus Name 



sp:UP04_ECOLI



Acc#

P39169:P76624:P77022:P77023



UNKNOWN PROTEIN FROM 2D-PAGE (SPOT LM6)



ORF Name . 


NTID 


AAID 


■ NT 
Length 


AA 

\ — -. , Score 
Length 


Probability 


16597790_t'3_131 


,640, . 


■ 256,0 


353 , 


1062 | 232 


2.6e-19


Protein name 








Locus Name 


Acc# 










sp:NUCl_CUNUt! 


P81203 ■ 


Description ' 












NUCLEASE C1












ORF Name 


NTID ' 


AAID 


,NT ., , 
> Length 


— . , Score . 
Length • • — , . 


Probability 


i6603411 L 12_85 


641 


2561


319 


960 . |597 | 


|4.8e-58 



Protein name

Description

PUTATIVE 2-HYDROXYACID DEHYDROGENASE HI1556



Locus Name



sp:YF56_HAEIN



Acc#
P45250



ORF Name



NTID 



AAID 



' — ' Score Probability 



17036428 £3 138 



... NT 
Length 



AA 
Length 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT






ORF Name 



NTID 



AAID 



19b644b« Y7 bb 



NT 
Length 
212 



AA 

■ "~ , . Score. 
Length 



Probability 
4.0e-15



Protein name 



Locus Name 



probable glpG protein 



pir:D71258



Description 



ORF Name 



Acc# 
D71258



NTID 



AAID " 



][ 



NT 
Length 
610 



AA 



T — , -i Score Probability 
Length — - ' 



1.7e-53



Protein name 



Locus Name 



hypothetical protein 



pir:S75944



Acc# 
S75944 



Description 



ORF Name ■ 



NTID AAID 



NT • 
Length 



22035932 c2 209 



AA 
. Length 
112 51 



Score Probability 



"5TF" 



2.3e-49 



Protein name ; 



Locus Name 



B1306.06c protein



gp:MLB1306



Acc#
Y13803



Description



Mycobacterium leprae cosmid B1306 DNA.



ORF Name 



NTID • AAID 



22775251 cl 144 




AA 
Length 
11632 



Score Probability 
11799 I |2.0e-iSb 



Protein name 



acetolactate synthase, III large
chain: acetohydroxy-acid synthase III large
chain



• Locus Name 
pir:YCEC3I 



Description 
• ORF Name 



Acc# , • 

E64729:S14 . 
385:546590 
:A011I3 :I4 



NTID AAID. 



22890836 c2 214 



NT 
Length 
179 



AA 
Length 



Score " Probability 



l.tte-41 



Protein name. 



Description 



Locus Name 



|gp:AHU55832 



Acc# 
U568 32 



Aeromonas hydrophila FK506 binding protein (fkpA) gene, complete cds in 3.9
kb fragment.






ORF Name: 23595787_c3_220
NT ID: 648
AA ID: 2568
NT Length: 150
AA Length: 453
Score:
Probability:
Protein name:
Locus Name:
Acc#:
Description: NO-HIT


ORF Name . 


NT-ID 


■ AAID 


1 NT . 
Length 


AA ' 
, — a , Score 
Length . 


Probability 


237i8750J:2_93 






. 202 . 


503 f 432 


1.5e-40


Protein name 








Locus Name 


Acc# 



sp:RUVC_HAEIN



P44633



Description

JUNCTION NUCLEASE RUVC) (HOLLIDAY JUNCTION RESOLVASE RUVC)



ORF Name 



NTID 



AAID 



123727181 c2. 215 



[5"57T 



NT • AA 
Length Length 
330 



Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT


ORF Name ' 


NTID ' 


AAID ''■ 


NT . 
Length 


AA 
Length 


Score 


- . Probability " 


239330u8_cl_1.51 


551 


2571 


123 


: 372 ' 


141 


1.0e-03



Protein name 



Locus Name 



hypothetical protein 



pir:T10511



Acc# 
T10511 



Description 



ORF Name 



23342557 c2 212 



Protein name 
Description 
NO-HIT



NTID 



652 



AAID 



NT AA 
Length Length 
^ " 



- -, Score Probability. . 



rzur 



Locus Name 



Acc# 






ORF Name 



NTID 



AAID 



2396332b t3 121 



^7T 



NT AA 
Length Length 
403 I. 112.12 



Score Probability 
1873 I 16 .^-88 T 



Protein name 



Description 



Locus Name 



sp:FADH_ECOLI



Acc#
P42593



A REDUCTASE) 


ORF Name: 24353427_c2_200
NT ID: 654
AA ID: 2574
NT Length: 546
AA Length: 1641
Score: 2074
Probability: 1.5e-214
Protein name:
Locus Name: sp:CH60_YEREN
Acc#: P48219
Description: 60) (CROSS-REACTING PROTEIN ANTIGEN)










. ORF Name 




NTID AAID- ' 


NT 
Length 


AA 
• Length 


Score 


Probability 


2461276 1_C3_ 


246 • 


655 - 2575 - 


259 . 


780 


I 467 1 


2.9e-44



Protein name - 
Description 
HYPOTHETICAL - PROTEIN HI0S94 



Locus Name' 



sp:YMFC_HAEIN 



, Acc# 
P44827 



• ORF Name . 
24648387 cl 170 



NTID 



AAID 



12576 



NT - ;aa 
Length Length 

im — 



: Score Probability ' 



Protein name < 



Locus Name 



thymidylate synthase 



gp:L78665 



Acc# 
L78665 



Description 



Methylobacillus flagellatum aspartate aminotransferase (aat), membrane
protein (prf-1), homoserine dehydrogenase (hom), and threonine synthase (thrC)
thymidylate synthase (thyA) genes, complete cds.


ORF Name: 24796885_c2_197
NT ID: 657
AA ID: 2577
NT Length: 268
AA Length: 807
Score: 801
Probability: 1.2e-79
Protein name:
Locus Name: sp:AMPM_ECOLI
Acc#: P07906
Description: METHIONINE AMINOPEPTIDASE (MAP) (PEPTIDASE M)






ORF .Name 


NTID AAID 


■ NT 
Length 


• AA 
Length 


Score 


Probability • 


25390711_ci_lB3 


658 2578


414 


1245 


593. . 


2.8e-60 


Protein name 


- ■ . i 




Locus 


Name 


Acc# 








sp:MUTY_ECOLI 


P17802 


Description 












A/G-SPECIFIC ADENINE GLYCOSYLASE


ORF Name 


. NTID AAID 


NT 
Length 


AA 
Length 


Score. ■■ 


Probability 


2>412907_tI_4V • 


| |659 | |2579 


1 ^ 1 


|288 | . 


|74 | 


0.044



Protein name 



Locus Name 



hypothetical protein 



gp:AP000363 



Acc# 
AP000363 



Description 

Bacteriophage VT2-Sa, complete genome sequence.



ORF Name 


NTID 


AAID. . 


■ NT 
Length 


AA 
Length 


Score 


Probability 


2bb66b7,7_tl_23 


660 


2580 


84 




252 






Protein name - 










Locus 


Name 


. Acc# 


Description 
















NO-HIT ' 


. ORF Name . 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


2558526;3J:2_64 ■ 


661 


2581- 


311 




926- 


|790 | 


1.7e-78



Protein name 



Locus Name 



diaminopimelate epimerase



pir:S01913



Description 



ORF Name 



Acc# 

B65185:S30699:S01913:A37841:S2



NTID. AAID 



26213-386 cT 242 



NT ' AA 
Length Length 
11215 



Score 



Probability 
2.0e-36 — 



Protein name 

Description 
UBIH PROTEIN



Locus Name 



sp:UBIH_ECOLI 



Acc# 
P25534






ORF Name 



NT ID 



AAID 



25214428 c2 201 



NT AA 
Length Length 
235 



Score 



HIT 



Protein name 



Description 



Locus Name 



sp:LGT_SALTY



Probability
5.7e-71

Acc#
Q07293



PROLIPOPROTEIN DIACYLGLYCERYL TRANSFERASE



ORF Name 



29303578 t2 74 



Protein name 
Description 
NO-HIT



NTID AAID 



NT AA 
— - , — , Score 
Length . Length 



Locus Name 



Probability 



Acc#



ORF Name 



29380307 c2- 139 



Protein name 
Description 
NO-HIT



NTID AAID 



FF5~ 



' NT AA 
Length . ' Length 
166 ' I [HTTT" 



— " v! Score Probability 



Locus Name- 



Acc#



ORF Name 



30258885 c3 234 



Protein name 
, Description 
NO-HIT



NTID AAID 



NT • AA 

Length Length 

P3 . 7 



— , Score Probability 



Locus 'Name 



Acc# 

s 



ORF Name 



33626535 tl- 51 



Protein name 
Description 



NO-HIT



NTID. .AAID. 



• NT AA 
Length " Length 
Ttl — 



Score Probability 



Locus .Name 



Acc# 






ORF Name 



NTID 



AAID 



34416075 c3 231 



12553 



NT AA 
Length Length 
803 . I' 12412 



Score Probability 
1247 6.3e-127



Protein name 



Description 



Locus Name 



sp:NfPRXjAZOVI- 



Acc# 
P36223 



TRANSFERASE)


(URIDYLYL REMOVING ENZYME)








ORF Name 


NTID AAID 


IN -1 

Length 


AA' score 
Length 


Probability 


35425918_c3_241 .: j | |2589 


176 


R l |'P02 | 


8.7e-27


Protein name \ 






Locus,' Name , 




Acc# . 


dihydrofolate reductase


pir:S52336


S52336


Description 




■ 








ORF Name 


' NTID AAID 


NT 
• Length 


AA 

■ . — \ , Score , 
Length 


Probability 


360510.61_r.2_.7y 


670. 2590 


327 


984 r |930 | 


2.5e-93


Protein name' 






Locus Name 




Acc# > 


probable 2 , , 






pir:G70875




G70875


Description . 












ORF Name: 3910943_c1_146
NT ID: 671
AA ID: 2591
NT Length: 343
AA Length: 1032
Score: 1278
Probability: 3.3e-130
Protein name: ketol-acid reductoisomerase
Locus Name: gp:AF125563
Acc#: AF125563
Description: Neisseria meningitidis NMB putative aconitate hydratase (acn), ornithine carbamoyltransferase (argF), and ketol-acid reductoisomerase (ilvC) genes, complete cds.



ORF Name 
3914002 13 114 



NTID AAID 



■ NT, ' AA 
Length . Length 

75T. 



Score Probability 



EZ1 



Protein name . 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID AAID 



H9390S3 12 8 if 



T5^T 



NT - - AA 
Length . Length 
469 - I 11410 



Score 

ED 



Probability 
U.2e-61 — 



Protein name 



Description 



Locus Name 



sp:MURD_ECOLI



Acc# 
P14900 



ADDING ENZYME)


ORF Name 


NTID 


' AAID 


NT 

■ Length 


AA 
Length 


Score . 


Probability 


|334VV13_ti_5' . 


IP 7 *" 


. 2594 


| pso 


\ m 1 


I 594 1 


I.0e-5V : .| 



Protein name 



Description 



Locus Name 



sp:VAAA_HAkilN 



Acc# 
P43908 



HYPOTHETICAL 


PROTEIN HI09S4 










ORF Name 


NTID AAID 


•NT 
Length 


AA 
Length 


Score 


Probability 


3954638_12i63. 


[675 2595 


438 


r 1317 


|1067 | 


7.5e-108



Protein name 



Locus Name 



sp:DCDA_PSEAE



Acc#
P19572



Description

DIAMINOPIMELATE DECARBOXYLASE (DAP DECARBOXYLASE)



ORF Name 



NTID AAID 



5962885 c3 219 



2596 



"mNT AA 

Length . Length 
181 



Score 



] [ 



Probability 
2.8e-46



Protein name 



Locus Name 



acetolactate synthase, III small
chain: acetohydroxy-acid synthase III small
chain



pir:YCEC3H



Description 



Acc# 

F6*4729:S14 
386 :.S40591 
: AO 11 14 :PS 



ORF Name 



NTID AAID 



3992035 ci 175 



NT ■ AA 
Length Length 
-] .11503 



Score 



[TiTT 



Probability 
4.6e-127 



Protein name 



Description 



Locus Name 



sp:PTA_HAEIN



Acc# 
P45107 



PHOSPHATE ACETYLTRANSFERASE (PHOSPHOTRANSACETYLASE)



ORF Name NT ID AAID 


"NTT* 1 

Length '.' 


AA 

T — Score 
Length 


Probability ... 


414088i_tl;_40 578 2598 


467 • 


1404 |7i0 | 


Id . ie 


n n 

- / u 


Protein name' 






Locus Name 




Acc# 


NorM 




gp:AB010463 




AB010463 


Description 












Vibrio parahaemolyticus gene for NorM, complete cds.






ORF Name ' .. ' NT ID • AAID 


NT 

Length, 


AA 

, — Score 
Length. 


Probability 


4I4V^_cI_lb9 ., |67$ . 


1 ^ 1 


|1572 | |i2W| 


|G.7e 


-1-23 | 


Protein name 






Locus Name 




ACC#; 








sp:YIFB_HAEIN




P45049 


Description I , 












HYPOTHETICAL PROTEIN HI1117


. ORF Name NT I D , '. AAID 


NT 
^Length 


AA 

— Score ' 
Length ■ 


Probability 


44-71887.__t3_lii |680 . 2600 


415 ; 


1248 - 637 


. 2. 8e 


-62 . . ; 


Protein name ■ , 






Locus Name 




ACC# ■ 


FtsW


gp:AF123260 


AF123260 


Description 












Coxiella burnetii FtsW (ftsW) gene, complete cds.






ORF Name 1 NTID AAID 


NT 
Length 


AA n 
. ■ — ; , , Score., 


Probability' 




Length 






4720313_ci 180 |68i . |260i 


149' | 


|450 | 161 


l.-2e 


-10 ■ 


Protein name 






Locus Name 




-Acc# 






sp:METE_ECOLI




P25665 


Description 












(COBALAMIN-INDEPENDENT METHIONINE SYNTHASE)


ORF Name NTID AAID 


NT 

Length \ 


aa- . . v 

, — . , : Score 
Length . - 


Probability 


4770887 t2_72 582 2602 


176 • >■ 


bJl |130 ; | 




-14 


Protein name 






Locus Name/. 




Acc# 


hypothetical protein. 




gp:SSU18930




Y18930 



Description 



Sulfolobus solfataricus 281 kb genomic DNA fragment, strain P2.






ORF Name' 



NT ID 



AAID 



5100010 11 7 



F5T" 



NT . 1 AA ■ 
Length Length 
— 1 11002 



Score 



Probability 



Protein name 



Description 



Locus Name 



sp:XERC_HAEIN



Acc# 
P44818



INTEGRASE/RECOMBINASE XERC








ORF Name ■' NT ID AAID \ 


NT 1 
Length 


AA 

. , Score 
Length 


Probability 


5275250_c3_2^5 | |€>Q4 | J2604 


1 -m | 


|609 | 528 


i.be-29 


Protein name 




Locus Name 


Acc# 


Trp repressor binding protein 




gp:AF067083


AF067083 



Description 



Vitreoscilla sp. outer membrane protein homolog gene, complete cds; Trp repressor binding protein gene, partial cds; and unknown genes.


nt AA 

ORF . Name NTID . AAID — . — ..- Score 

Length Length 


Probability ' 


593 802_r2_71 - | 685 . 2605 j 506 " 


1521 


433, 


l.le-40 • ■ ;. | 


Protein name . t • 


Locus Name 


' Acc# 



Description ,, , 



sp:RBN_HAEIN 



P44608 



RIBONUCLEASE BN (RNASE BN)










• ORF Name ' 


NTID AAID 


NT ■ 
Length 


AA 

r Score, 
.Length 


•Probability, . \ 


5970933_ii_15 


.686 ' 2606 


|105 . 


318. . ' 2.45 | 


3.6e-21 


Protein name 








Locus Name 


Acc# 


unknown protein ;■ 


gprMSCTCWPA 


M15467 


Description - 










M. tuberculosis 65 kDa antigen (cell wall protein a) gene.




ORF Name 


NTID AAID 


NT 
Length 


AA . . . 
— , Scores 
Length ■ ..- ■ 


Probability 


5988327_c3_227 


| 687 2607.; 




2784 1 11588 1 


|4.6e-163 ; 


Protein name 








"Locus Name 


Acc# 



sp:HEPA_ECOLI



Description . 

RNA POLYMERASE ASSOCIATED PROTEIN (ATP-DEPENDENT HELICASE HEPA)



P23852:P75633






ORF Name 



NTID 



AAID 



NT AA 
t T Score 
Length Length 



5498910- ±2 94 



Probability 
0.0050



.Protein name. 



Locus Name 



sp:YIHR_ECOLI



Acc# 
P32139 



Description "" , . . r , 

HYPOTHETICAL 34.0 KD PROTEIN IN GLNA-RBN INTERGENIC REGION



ORF Name 



NTID AAID 



NT . 
Length 



AA 

t — ^ Score 
Length — ; — — 



Probability 



6bl68BS_c2_lttb . | 599 2609 - 


213 | 




642 


I14U 1 


p.5e 


-09 


Protein name 






Locus Name ■ 




ACC# 


putative membrane protein. 




gp:SC6D7




AL133213 


Description 














Streptomyces coelicolor cosmid 6D7


ORF Name NTID.' AAID 


NT 
Length 


AA ; > 
Length 


Score 


Probability 


565876_c3_245 69.0 2610 






1659 . 


215 


l.le 


-16 


Protein name ... . ; . • • 






Locus Name 




Acc# 



Description 



sp:OMPA_BORAV



Q05146 



OUTER MEMBRANE 


PROTEIN A PRECURSOR 
















ORF Name 


NTID AAID 


NT 
Length 


AA - ' 
\ — . , Score 
Length 


Probability. 


8575_c3_236 


691 2611 . 


14 0 


423 




4 .Oe 


-31 


Protein name . 








-Locus Name 




Acc#'. 




i 1 1 






sp:CH10_PSEST




O33499


Description 
















10) . . . , 




ORF Name 


NTID AAID 


NT 
Length 


AA 

— , Score " 
Length 


Probability 


859389_r2_96 


692 2612 = 


342 


1029 


537 


l.le 





Protein name 

Description 
(FPP SYNTHASE)



Locus Name 



sp:ISPA_HAEIN



Acc# 
P45204



ORF Name 



NT ID 



AAID 



NT AA 
t ~~ t* — , i Scor e 
Length Length 



970625 cl 145 



Probability 
1.1e-14



Protein name 



Locus Name 



sp:ILVI_ECOLI



Description : . V . 

III) (ACETOHYDROXY-ACID SYNTHASE III LARGE SUBUNIT) (ALS-III)



Acc# 

P00893:P78045



• ORF Name 
9957828 £2 88 



NTID 



AAID 



NT 
Length 
~ 



AA • • : 

■ Score Probability 
Length — - — * 



T5TT 



0.00086



Protein name 



Locus Name 



transposase 



gp:CETC2 



Description 



Acc# 

X59156:S88451



Caenorhabditis elegans transposon Tc2.


ORF Name NTID • AAID- 


NT, 
Length 


AA 

- — . Score 
Length . ■ 


Probability 


|2i736625_c2_ii 595 | 2615 


1290 


870 | |508 | 


|l.ie-48 | 


Protein name ,' • 




.Locus "Name 


' Acc# 






sp:A&M_HAEIN 


P44751 . ; 


Description . \ , 








(DIADENOSINE TETRAPHOSPHATASE)


ORF Name . NTID AAID - 


NT" 
Length 


AA 

— - Score 
Length 


Probability. s 


26053825 11 ,4 . • 696, 2615 


525 | 


1587' ' [1002 | 


|5:8e-101. ■:. ' 



Protein name 



Description 



Locus Name 



sp:DNAB_ECOLI



P03005



REPLICATIVE DNA HELICASE


" ORF Name NTID AAID 


NT 
Length 


AA J\ 
T — . , Score 
Length 


Probability 


33984677_13_6 697 ■ 2617 


1 , 


1146 • 555 


I:4e-53 


Protein. name • 




' Locus Name 


' Acc# : 


biosynthetic alanine racemase 




gp:AF165882


AF165882 


Description - 


Pseudomonas aeruginosa biosynthetic alanine racemase (alr) gene, complete cds.






ORF Name 



NTID 



AAID 



5181430 c3 14 



NT 
n 
777 



AA 

, t — ^ Score 
Length . Length 



Probability 
3.6e-60



Protein name 



Locus Name 



Acc# 





I* . 




sp:PDXA_ECOLI


P19624 


Description 










PYRIDOXAL PHOSPHATE BIOSYNTHETIC PROTEIN PDXA






ORF Name 


NTID AAID 


NT AA 
— . , , — . , Score 
Length Length 


Probability 


|76S0_c3 15 j 


699 | |2619 


;.. |295 ; ,- |8 


38 |603 | , 


|l.le-bB 



Protein name 

Description . 
DIMETHYLTRANSFERASE)



Locus Name 



sp:KSGA_ECOLI



Acc# ", 
P06992 



ORF Name 



NTID 



AAID 



10547155 cl 79 



7TJTT 



2620 



NT 

[Tu^F 



AA 

, — Score 
Length Length - — = — 

14176 



3291 



Probability 
]0Tu "~ : — 



Protein' name.' 



Locus Name 



. carbamoylphosphate synthetase large subunit 



gp:PAU81259



Description 



Acc# : ' 

U81259:L27528



Pseudomonas aeruginosa dihydrodipicolinate reductase (dapB) gene, partial cds, carbamoylphosphate synthetase small subunit (carA) and carbamoylphosphate synthetase large subunit (carB) genes, complete cds, and FtsJ homolog (ftsJ) gene, partial cds.



ORF Name 

■f 



NTID 



AAID 



I1191S953 ci-76. 



TUT 



T£TT~ 



NT . AA 
Length Length 
139 



Score 



Probability 
17 . 5e-2« . 



Protein name 



Locus Name 



probable oxidoreductase 



pir:T35853



Description 
• ORF Name 



NTID 



Acc# 
T35853 



NT AA i 
A AID — ^ ' — ' Score? v Proba bility 
~~ Length Length — — — — -. x 



13947152 t-2- 37 



TUT. 



EES' 



|381 | - 



Protein name 
Description . 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NT ID AAID 



14094052 tl 20 



7uT" 



2623 



NT - 
Length 
111 



AA 



T Score Prob ability 
Length — — ^ — ; • ^ 



|9.7e-12 



Protein name 



Locus Name 



sp : YeCR:_ECOLI 



Description • ' 

HYPOTHETICAL 12.4 KD PROTEIN IN HELD-SERT INTERGENIC REGION



Acc# ^ 

P45572:P75878



ORF Name, 


NTID AAID 


NT 
Length 


AA 

■ — , , Score 
Length 


Probability 


15501077_12_30 


' | 704 2524 


91 


282 |103 | 


|1. le-05 ( 


Protein name 








Locus Name ' 


Acc# 


hypothetical protein . . • . , 


gp:SSU18930 


Y18930 \. 


Description 












Sulfolobus solfataricus 281 kb genomic DNA fragment, strain P2.


ORF Name 


NTID ' AAID 


NT AA 
— — — ■ Score 
Length Length 


Probability 


i9552585_i3_55 : 


705 | |2525 


309 


930 530 


|5.0e 51 . 


Protein name . 








Locus Name 


Acc#- ■ 










sp:CBL_ECOLI


Description 










y4 / Ub j : P / b 












353 % . 


TRANSCRIPTIONAL REGULATOR CBL










ORF Name 


NTID AAID . 


NT 
Length 


AA . . 
— — , Score 
Length. 


Probability 


19571925_c3_117 


705 j |2525 


70- . 


213 




Protein name 








Lqcus Name 


. Acc# 


Description 












NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA ■ . _ 
— - Score 
Length 


Probability 


203l?325_t2_40 


| 707 2527 


55 


198 





Protein name 
Description 



Locus Name 



Acc# 



NO-HIT 






ORF Name 



NTID 



AAID 



20488827 11 10 



NT AA 
Length v Length 
558 Z I 11677 



Score 



T£T8~ 



Probability 
2..3e-168 



Protein name 



Locus Name 



sulfite reductase



gp:AF026066



Acc# 
AF026066 



Description 



Pseudomonas aeruginosa sulfite reductase (cysI) gene, complete cds.


ORF Name 


NT AA 

NTID AAID , — . ■ — , ■ Score : Probability 
Length Length — - — = — =— f- 


20953402_c2_103 ; 


709 2629 | |328 ; |987 | 213 -j 3.2e-15 



Protein name 



Locus Name 



sp:LPPB_HAEIN



• ' ; Acc# 
P44833 



Description ' £ 

OUTER MEMBRANE ANTIGENIC LIPOPROTEIN B PRECURSOR 



ORF' Name 



NTID AAID 



21907078_13/b5 


7i 0 


2630 









NT AA , 

Length Length 

166 ' 



Score Probability , 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 
22766067 11 ,21 



NTID AAID 
] [7TT 



2631 



NT : AA 
Length Length 
Tm — | v 11437 



— ■ Score 



Probability 
0.012



Protein name 



Description 



Locus Name 



sp:THDF_MYCGE



■ Acc# .' 

P47254:Q49330



POSSIBLE THIOPHENE AND FURAN OXIDATION PROTEIN THDF






ORF Name 


NT AA , ■ 
NTID AAID -■■ . — t ■ ' — ■ 
Length Length 


Score . 


Probability 


24251S10_c2_88 


712 2632 |364 • 1095 


. .746 - : 


; . 7.8e-74 



Protein name 



Locus Name 



hypothetical protein Rv3629c 



pir:F70561 



Acc#
F70561 



Description 






ORF Name 



NTID 



AAID 



NT AA 
■ - — • — - . Score Probability 
Length - Length — — — •■ 



24259555 12 39 



7TJ- 



Protein name 



Locus Name 



similar to glutathione-S-transferase



gp:AF036940



Description 



1.6e-34 



Acc# ■ 

AF036940:AF081362



Pseudomonas sp. U2 plasmid pWWU2, ferredoxin reductase (nagAa), salicylate-5-hydroxylase large oxygenase component (nagG), salicylate-5-hydroxylase small oxygenase component (nagH), ferredoxin (nagAb), naphthalene dioxygenase large oxygenase component (nagAc), naphthalene dioxygenase small oxygenase component (nagAd), cis-naphthalene dihydrodiol dehydrogenase (nagB), salicylaldehyde dehydrogenase



ORF Name 



NTID 



AAID 



'NT AA ■ 

— ~ . — ' ' Score 

Length Length - — :. 



24801562 c2 82 



7U 



TT5"4~ 



Probability 
18 :0e-73 



Protein name 



permease for AmpC beta-lactamase expression



Locus Name
gp:AF082985



Acc# 
AF082985



Description 



Pseudomonas aeruginosa permease for AmpC beta-lactamase expression AmpG (ampG) gene, complete cds; and unknown gene.



*ORF Name 



NTID AAID 



NT AA ' ' . ' . 
: — ■ , ■- Score Probability 
Length . Length ■• — ^ — — - — : — - * 



25448412 c2 9.5. 



TUT 



2535 



Protein name 



297 | |186 | |1.7e-14 , ~ 

Locus Name ' • . Acc# 



unknown 



gp:AF033858



AF033858 



Description 



Pediococcus pentosaceus strain ATCC43200 plasmid pMD136, complete plasmid sequence.



ORF Name 



29429590 12 44 



NTID, 
1715 



AAID 



NT AA 
Length Length 
TF2 



Score 



2TT 



Probability 
12 . 5e-20 — 



Protein name 

Description 
PROTEIN VDLD



Locus Name 



sp:VDLD_HELPY



Acc# 
O05729






ORF Name NT ID AAID 


JN 1 

Length 


AA 

T — ; , Score ■ 
Length 


Probability 


30181587_rl_9 [717 2537. 


1 76 • 


2 


31 |43 | 


jO. 037 


Protein name 






Locus Name 




Acc# ' 


bone morphogenetic protein 2




pir:A61387


A61387 


Description 












ORF Name NTID AAID 


NT 

Length 


AA . 
T — . , Score 
Length 


Probability 


3129537 c2 94 718 2638 


323 


972 _ |577 | 


|6 .3e- 


-56. 


Protein name 






Locus Name 




" Acc#\ . 


; mrr restriction system protein 






pir:F75508 


F75508 


Description 












ORF Name NTID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability . 


3137521213 54 . \ '719 - 2539 


615 


1848 | |1278| 


|3.3e 


-130 


Protein name 






Locus Name • 




Acc# 








sp:YFBQ_ECOLI




P77727 


Description 












PROBABLE AMINOTRANSFERASE YFBQ


ORF Name ' NTID AAID 


NT 
Length 


AA 

, — ; • Score 
Length 


Probability 


32878 _12_25 720 |2640 . 


349 


1050 | 176 


|1 :2e 


-10 ■ • 



Protein name 



Description 



Locus Name 



ECO110K 



■ Acc# > 

D10483 : Jpl 
5 97 : JO 16 8 3 
: J01706 :K0 



E. coli K12 genome, 0-2.4 min. region.












ORF Name 




•NTID AAID 


, NT 
Length 


AA 
Length 


Score • 


Probability . 


33788387_c3_ 


108 , 


|72i |2641 


66 | 


201 


r i 


|0. 00053 | 



Protein name



Locus Name 



hypothetical protein SPCP31B10.02 



pir:T41692



Acc# 
T41692 



Description 






ORF Name 



NT ID 



AAID 



391^688 cl 7tf 



TIT 



NT 
n 

TUJ5 



AA 

t Score 
Length Length — 



Probability 
7.4e-46 ; " 



Protein name 



Description 



Locus Name 



sp:Y318_HAEIN



Acc# 
P43984 



HYPOTHETICAL 


PROTEIN HI0318 










- ORF Name 


NT ID 1 AAID 


NT 
Length 


AA ' 
Length . 


Score 


Probability 


40875 JiljLti 


■ | |723 [2643 


126 ■ | 


J81 


I 185 1 





Protein name 



Description 



Locus Name 



sp:YHEN_ECOLI 



Acc# 
P45532 



HYPOTHETICAL 13.5 KD PROTEIN IN RPSL-FKPA INTERGENIC REGION




ORF Name 


NT ID AAID 


. NT AA 

— — ' Score 
Length Length 


.Probability 


4490902_c2_101 


724 : . 2644 


166 | 501 522 


4.3e-50 



Protein name 



Description 



Locus Name 



sp:GREA_ECOLI



Acc# 

P21346:P78111



GREA)


ORF Name ' 


" NT ID 


AAID 


NT 

"•• Length " 


AA o 
— Score 
Length 


Probability 


4536668 c3 107 , 


725 


. 2545 


940 


2823 • 1554' 


1.9e-159 


Protein name 








Locus Name 


Acc#



Description . . 



sp:GLNE_ECOLI



P30870:P78107



SYNTHETASE ADENYLYLTRANSFERASE) (ATASE)



ORF Name 



NTID 



AAID 



4572125 ci 72 



NT ' AA 
Length Length 
311 



Score 



Probability 
ll.le-81 : 



Protein name 



Locus Name 



unknown 



gp:PAU63816 



Acc# 
U63816



Description 



Pseudomonas aeruginosa glnE gene, partial cds; IlvE, ADP-heptose:LPS heptosyltransferase homolog (waaF), lipopolysaccharide heptosyltransferase I homolog (waaC), glucosyltransferase I homolog (waaG), RfaP protein (waaP), and unknown protein (waaX) genes, complete cds; and inaA gene, partial cds.



ORF Name 


NTID 


AAID 


NT ' 
. Length ■ 


AA 

m - — , , Score 
Length 


Probability 


5084827 13_48 


727 ■■ 


264 V 


153 | 


492 193 | 


1. 5e 


-14, 


Protein name. 










Locus, Name 




Acc# 


i 










sp:SURA_ECOLI


P21202:P75630


Description . • 














SURA) (PPIASE) (ROTAMASE C)












. ORF Name 


NTID . ' 


AAID 


NT. 
Length 


aa Score '■ 
Length . 


Probability 


5287555_c2_102 


• 728 


|2648 


125 

- f ■ 


|378 | 125 


5.0e 


-08 



Protein name 



Description 



Locus Name 
sp:YC53_HAEIN



Acc# " 
P44139



HYPOTHETICAL PROTEIN HI1253


.ORF Name NTID, AAID . 


NT ' 
Length 


AA 

— Score 
Length 


Probability 


5322062_cl_69 729 2649 


255 




798 , 449: 


2.3e-42 



Protein .name ••' - 

Description 1 . . 

MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN 



Locus Name ; 



sp:MOEB_ECOLI



Acc# 
P12282



ORF Name 



NT ID AAID 



682220&c2 U3 



7Tu~ 



• NT AA 
Length Length 
296 



Score 



TOT" 



^Im- 



Probability
5.6e-48 ' 



Protein name 



Description 



Locus Name 



sp:HEMK_ECOLI 



, Acc# 

P37186:Q46754



HEMK PROTEIN 


ORF Name 


NT ID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


6829637_t2_38' 


. 7 


2551 


.455 




1368 


. I 926 1 


|6.6e-93 



Protein name 



Description 



.ORF Name 



Locus Name 



sp:CYSG_ECOLI



Acc# 

P11098:P76685



NTID AAID 



7117182 c3 114 



2652 



NT, AA 
Length Length . 
115? 



Score •': Probability 
[200 | |b.6e-16 



Protein name 



Locus Name 



unknown 



gp:AF033858 



Acc# , 
AF033858 



Description 



Pediococcus pentosaceus strain ATCC43200 plasmid pMD136, complete plasmid sequence.



ORF Name 



NTID AAID 



7281552 cl 77 



7TT 



NT AA 
Length Length 
416 I 11251 



Score 



1W 



Probability 
16 .4e-130 



Protein name 



Description 



Locus Name 



sp:CARA_PSEAE



Acc# 
P38098 



PHOSPHATE SYNTHETASE GLUTAMINE CHAIN)


: NT 

ORF Name , NTID -AAID . . T — 
• • - Length 


• AA 
Length .. 


Score 


Probability 


953i81_c3_i0 ; 6 73,4 2654 450 | 


1353 


; 449 . 


. 2.3e-42 ; 



Protein name 

Description 
HYPOTHETICAL 54.8 KD PROTEIN CY20H10.28C



Locus Name 



sp.:Y16S_MYCTU 



Acc# 
P96936 






ORF Name , 



NTID AAID 



978465 11 1^ 



7T5~ 



NT , ' AA 
Length > Length 
102 



Score • Probability 



"JUT 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID. AAID 



NT AA ,/ 
— - — 7 Score 
Length Length — 



15985542 c2 43 



2555 



TZT 



Probability 
9.7e.-08 



Protein name 

Description 
GLUTAMATE RACEMASE



, Locus Name 



sp:MURI_SYNY3



ACC# 
P73737 



ORF Name 



NTID 



AAID 



185185 c3 54 



7TT 



2557 



NT . v AA 
Length Length 
233 ,., | . |702:- 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 
23628411 tl 5 



NTID 



AAID 



NT . AA 
Length .." Length 
295 | 1888 : 



Score 



TTT 



Probability 
|5.1e-35 " 



Protein name 



Description 1 



Locus Name . 



sp:YCHB_ECOLI



Acc# 
P24209



HYPOTHETICAL 30.9 KD PROTEIN IN HEMM-PRSA INTERGENIC REGION


ORF Name NTID 


NT . AA 
AAID — • ■ , — Score 
Length Length 


Probability 


29494003_12_13 ■ 739 


2559 , 572 - 2019 ■ 241 


5.5e-17 v 



Protein -name 



Locus Name 



sp:YHE3_PSEAE 



Acc#. \ 
P42810 



Description 

HYPOTHETICAL .54.8 KB PROTEIN IN HE^-HEMA INTERGENIC REGION ■ (0RE3 ] 






ORF Name 



NTID AAID 



29960761 ci 30 



73TT 



NT AA 
Length "' Length 
TZZ ~| [1407 



Score 



TTT 



Probability 
3.0e-72 



Protein name 



Description 



Locus Name 



sp:HEM1_PASMU



Acc# 
P95525 



GLUTAMYL-TRNA REDUCTASE (GLUTR)








ORF Name NTID 


aaid ; 


NT 
Length 


AA ■ ' 
•j- — . ■, Score 
Length 


Probability 


3129li51_t2_15 ||741 




p S 5 | 


■798 |924 | 


|l.le-92 | 


Protein name 






Locus Name " 


• Acc# 








gp:ECOPRS


M13174


Description 










E. coli prs gene encoding phosphoribosylpyrophosphate synthetase, complete cds.


ORF Name NTID 


AAID 


-.' NT, 
Length 


AA ' 
T — . v Score 
Length 


Probability 


346413082t3__23 , | |742 : 


| 2662 




591 | [160 | 


|9 .7e-12 ' 



Protein . name. 



Description 



Locus Name 



sp:LOLB_PSEAE



- Acc# 
P42812 



OUTER MEMBRANE LIPOPROTEIN LOLB PRECURSOR




ORF Name 


1 - ' ' NT 
NTID.' AAID ■ -.- — , 
. Length , 


AA 

. Score Probability . 
Length — - — 


4869025_t3_iy 


. • ' 743 2663 554 


1665 |1876 | |1.4e-133 ; 


Protein name - 




Locus 'Name ■, Acc# 






sp:ETFD_ACICA P94132 


Description 






DEHYDROGENASE) 


(ELECTRON-TRANSFERRING-FLAVOPROTEIN DEHYDROGENASE)


ORF. Name 


NT 

NTID. AAID — , 

Length 


AA ' 
„ — Score - Probability ■ 
Length ■ i. - ---- - ' • 


10552i53_rl_31 


744 .2664 74- ■ | 


; 225 ; 168 , 7.1e-12 


Protein name - 




Locus, Name - ■ Acc# 



gp:AB028868



Description . 
Mus musculus P4(21)n mRNA, partial cds. 



AB028868 






ORF Name 



NTID AAID 



NT AA ■ , . ,^ 
T — _ Score Probability 
Length Length — ■ — - — — - — ~ JL 



10722125 12 65 



7^" 



2665 



T51T 



165 I |2.9e-12 



Protein name 

Description 
SMALL PROTEIN A



Locus Name 



sp:SMPA_ECOLI



Acc# 
P23089 



ORF Name 



NTID AAID 



NT AA . , . . 
■ — _ — _ Score Probability 
Length Length — — — : — : — 2 - 



117037 13 138 



6"57~ 



375 I |1.6e-34 



Protein name 

Description 
HYPOTHETICAL PROTEIN HI0787



Locus Name 



sp:Y787_HAEIN



Acc# 
P44052 



"ORF ; Name 



NTID 



AAID 



NT AA 

— ~ — Score 
Length Length ; — 



11885927 11 51 



7¥7~ 



1266 7 



7F" 



\TTT 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# : 



ORF Name 



NTID 



AAID 



NT . AA ■ 

— , . , — . Score Probability 
Length Length — -.■ .-■ — — 



12297203 c3 236 



7W 



2TT 



1138 I |2.1e-09. 



Protein name 



Locus Name 



hypothetical protein APE2061



pir:G72510 



Acc# 
G72510



Description 



ORF Name 



12506635 12 83 



Protein name 
Description 

NO-HIT



NTID 



AAID 



7T9~ 



NT AA 
Length Length 




Score Probability 



Locus Name 



Acc# 






ORF Name 



NTID AAID 



112697037 c'A TUT: 



NT ' AA 
Length Length 

HI I 



Score Probability 

p-:fle-0b ~ • 



Protein name. 



Description 



Locus Name 



gp : t>A»LE>H 



. Acc# 
X70925 



P. acidilactici gene for D-lactate dehydrogenase.


NT AA 

ORF Name NTID AAID — — 

Length Length 


S core . Probabi 1 i ty 


13078416__c3_233 751 2571 167 ■ ' |504 


' |210 | |4:9e_-17 



Protein name' 



Locus Name 



ribosomal-protein-serine N-acetyltransferase, rimL homolog



ydaF 



pir:F69768



Acc# 
F69768 



Description 
ORF Name 



NTID - AAID 



134635 c2 225 



NT 
Length 
147 



AA 
Length 
1444 



Score 



5T9~ 



Probability 
6.7e-52 ~ 



Protein name 



Locus. Name 



ferric uptake regulator



IgpiABDNAPtm 



Acc# , 
Y14980 



Description 



Acinetobacter baumannii fur gene.








'■ . i .ii 


7 ORF Name ; NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


13726003_c2_i96 ; 753 . 2673 . 


345 


10 5 Or 


736 


S.5e-73; . v ■■■■ 



Protein name , 



Locus Name 



iron transport protein: protein slr1295: protein slr1295



pir:S74651 



Acc# 
S746 91 



Description 
ORF Name 



NTID AAID 



138-70462 13 144 



12674 



NT 
Length 
ST — 



Score 



AA 
Length 
|2 46 ; | [ITT" 



Probability 
|3.5e-07 



Protein name 



Locus Name 



hypothetical protein jhp1163



pir:B71840



Description 



Acc# 
B71840 



ORF Name 



NTID AAID 



14553432" "£2 57 



75T~ 



12675 



NT AA 
Length Length 
JUT— 1 



5TT 



Score Probability 



|2.4e-63 ' 



Protein name 



Locus Name 



sp:METR_SALTY



Acc# 
P05984



Description ' 
TRANSCRIPTIONAL ACTIVATOR PROTEIN METR 



ORF Name 
[Iba203i2_c2_223 



NTID AAID 



NT AA o • ; ., , . 
— ^, — =- Score Probability 
Length Length — — = — L - 



2 p604 | [1993 | [; 



|2 ._2e-195 



Protein name 



Locus Name 



UspA2 



Description • 



E 



AF113611 



■ Acc# ,-■ 
AF113611 



Moraxella catarrhalis strain V1171 UspA2 (uspA2) gene, complete cds.


■ NT • AA 
ORF Name NTID AAID — -; ■. — , Score 

Length Length 


Probability 


16016?26_t3_137 757 ; 2677 237' 


714 197 


j.ie-15 



Protein name 



Locus Name 



growth factor-responsive protein, vascular smooth muscle: SM-20



pir:A53770



Acc#
A53770



Description 
,ORF Name 



NTID - :AAID 



NT-, 
Length 



160640S1 e2 187 



[T51T 



AA • ' 
Length 
- 11154 



Score 



Probability 
|5:5e-73 _r_ 



. Protein name 

Description 
mi 2.4.1. -) 



Locus Name 



Acc# 
P45065 



ORF Name 



NTID ; AAID 



lifel7190b- c2 202 



NT 
Length 

74 . : , 



AA 
Lengthy 
1225 



Score ' Probability 



Protein- name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NT ID. 


AAID 


. NT: 
, Length; 


. AA 

1 r '^\_iv Score 
-Length 


Probability 




760 


2680 ■ 


765 


2298 |2390| 


4.8e 


-248 


Protein name 


... i' 






Locus Name 




Acc# 










sp:IDH_AZOVI




P16100 


Description 














DECARBOXYLASE) (IDH) .... 


ORF Name 


NT ID 


AAID 


NT . 
Length 


AA ".: ■" 
— , Score 
Length 


Probability 




761 


2681 


| , 


|1467 |10B7 | 


8.6e 


1 


Protein name 








Locus Name 




ACC# . 


' hypothetical protein F3 2D8 






pir:T21659 




T21659 


Description : 














ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

. — . , Score 
Length 


'Probability. - • 


20080651_clJL71 


762 


2682 


, 299 


|900 |330 | 


p.4e 


-30 



Protein name 



Description 



Locus Name 



sp:YJJV_ECOLI



• Acc# . 

P39408:P78143



HYPOTHETICAL 28.9 KD PROTEIN IN OSMY-DEOC INTERGENIC REGION




ORF Name'. 


NT ID .AAID 


NT AA- 
— " ■• • — , Score 
Length Length . . . \ 


Probability . 'j. 


20b09628_c2^204 


763 2683 


378 


1137 . |891 | 


|3.4e-89. 


Protein name 






Locus Name' 


- , ."' Acc# " 



Description . 



sp:FTSZ_ECOLI



P06138:P78047:P77857



CELL DIVISION PROTEIN FTSZ


ORF Name NT ID AAID 


,' NT 
Length 


AA 

„. — Score 
Length .• 


Probability 


2132006_c2_218 764 . 2684 


132 




399 153 


2v4e.-10. 


Protein name '. 






Locus Name 


Acc# 


hypothetical protein sll1830






pir:S75232


S75232 


Description 














ORF Name 



NTID 



AAID 



22048442 12 9b 



2685 



NT 
n 



aa. t ' 

^ T — 4- v. Score 
Length Length ; — ; 



TZUT 



] [ 



TuTTF 



Probability 
|8.2e-102 



Protein name 

Description • 
SUCCI1T7L-DIAMIN0PIMELA ( 3 DAP ) 



Locus Name 



sp:DAPE_HAEIN



Acc# 
P44514 



ORF Name 



NTID 'AAID 



22322956 13 .130 



NT 
Length 
ZT~ 



AA 



Protein name 



Score Probability 

Length — ^ — : 

|186 | [109 | |6.5e-06 

Locus Name Acc# 



hypothetical protein PH0221 



pir:D71245



D71245 



Description 












ORF Name 


NTID 


AAID 


NT 
. Length 


AA 
Length 


Probability 


22463'31I_t3_i28 ■ 


. 7.67. 


2687 


'103 .1 


|312 




Protein name 








. t • 

Locus Name 


Acc# 


Description 












NO-HIT


. " ORF Name' 


NTID 


AAID 


NT. 
Length 


AA Y. 

— Score 
Length 


Probability 


22734807_tll40" 


768 


2688 


97; 


294 , 




Protein name 








Locus Name 


. s{ Acc# ' 


Description 












NO-HIT










' ii •• 


ORF Name 


. NTID 


AAID 


NT 
Length 


AA . ' 
; — , \ . Score 
Length ■ 


\ - ■•; • 
Probability - 


22930306_Cl_JL70 , 


769. 


2689 


354 


|1065 | 427 | 


5.0e-40 



Protein name 



Locus Name 



5'-nucleotidase



gp:CLI131243



Description 
Columba livia mRNA for 5'-nucleotidase.



Acc# 
AJ131243 



ORF Name . 


NTID ; 


AAID 


NT 
Length


AA 

, — \- , Score 
Length 


Probability 


23445i36J:2 60 


77.0 * 


2690 


583 ; 


1752- |752 | 


|4.3e-120 


Protein name 








Locus Name 


Acc# 


NH(3)-dependent NAD(+) synthetase




pir:G72277 


G72277 


Description 










' . '• ' • 


ORF Name 


NTID 


AAID 


NT

Length 


. ■ — , Score 
Length 


Probability 


|23532300_cl_I65 


1 771 


2691 


1 


I 237 1 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


, ORF Name 


NTID ■ 


AAID 


. NT 
Length 


AA 

_ — Score 


Probability ' ' 








Length 




23556625_c3_245 


|772 


2652 - 


241 | 


|726 . 185 


|2.2e-14 


Protein name - 








Locus Name 


Acc# 










sp:FTSQ_ECOLI


P06136 ' 


Description 












CELL DIVISION PROTEIN FTSQ


ORF Name 


NTID 


AAID 


NT, 
Length 


f aa 
Length . 


Probability 


23615832_ii_16 . 


773 




334 


\ 1005 ,125 


9.1e-09 


Protein name; ' 








Locus. Name 


. ACC# .. 


lysophospholipase homolog






pir:T02661 


T02661 . . 


Description 












ORF Name ■ ! 


NTID 


AAID : 


NT 

Length 


AA 

, — . ; Score 
Length 


Probability 


|241110i5_c3_265i 


• | 7 ^ 4 . ■ 


2694 


■ ±B4. | 


465: | |274 | 


|8.1e-24 


Protein, name 








Locus Name 


ACC# ' 



Description 



sp : RIB^ECOLI 



P08391:P75 
621 






ORF Name 



NT ID , AAID 



242190b6 ci 184 



775" 



NT AA 

r ~ "^-u t — ^ Score 

Length Length 

2^" 



7TT~ 



"5TT" 



Probability . 
4.9e-49



Protein name 



Description 



Locus. Name 



sp:YPT5_PSEAE



• Acc# 
P24562 



HYPOTHETICAL -24 


b KD PROTEIN IN PILT 5 ' REGION (ORF5) 






ORF Name 


NT AA 
NT ID AAID — — - . 

Length Length 


Score 


Probability 


24220786_tl_4 


| 776 2696 | 570 .1115 | 


835 | 


•2,9e-83 



Protein name ■ 



Description 



Locus Name 



sp:PILT_PSEAE



Acc# 
P24559 



TWITCHING MOBILITY PROTEIN










ORF Name 


NT ID AAID 


, ■ NT 
Length 


AA o . " 
r — . , Score 
Length 


Probability 


24250928_cl_153 


777 2697 


• 78 




237 |207 | 


|1.0e-16 


Protein name 








Locus Name 


" Acc# 



sp:YFHJ_ECOLI 



P37096 



Description - ;i ■ • 

HYPOTHETICAL 7.7 KD PROTEIN IN PEPB-FDX INTERGENIC REGION



.-; ORF Name 


NT ID 


. AAID 


NT 
Length 


AA 

„■ — , .Score 
Length 


Probability 


242S52G0_c2_229 .. - 


778; 


2558 ■ 


115 | 


348 




Protein name 








Locus Name 


Acc# '■: 


Description 












NO-HIT 














ORF Name 


NT ID 


AAID 


NT 

Length 


AA 

m "■ Score 
. Length - ■ 


Probability 


24412562_c2_18? 


|779 




| 314 


|945 740 | 


3 .4e-73 


Protein name 








Locus Name 


Acc# . . 



sp:DDL_HAEIN 



P44405 



Description 

D-ALANINE--D-ALANINE LIGASE (D-ALANYLALANINE SYNTHETASE)






ORF Name 



NTID 



AAID 



2441b875 ci 174 



NT AA 
Length Length 

322 -I 



Score 



Probability 
|6.0e-83 ~~ 



Protein name 



Description 



Locus Name 



sp:HEMZ_ECOLI



Acc# 

P23871:P78232



SYNTHETASE)


ORF Name 


NTID' 


AAID 


NT 
Length 


AA 

— , . Score 
Length 


Probability ' ■■ 


24544577_t2_96 


781'. 


2701 


535 


1608 923 


1.4e 


-92 


Protein name 










'Locus Name 




ACC# ';■ 


hypothetical protein 




pir:S76051


S76051 


Description 
















ORF Name , 


NTID 


AAID 


NT 
Length" 


AA 

, — Score 
Length 


.Probability 


248i316i_r3_i43 


782 , 


2702 


344 | 


|1035 |650 


1 l 12e 


-63 


Protein name 










Locus Name 




Acc# 


MsmX 




gp:AB013374


AB013374 


Description 












i. 


Bacillus halodurans C-125 msmX, yjdA, ykoK and yvtK genes, partial and complete cds.
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

- — Score 
Length 


.■ Probability 


25579510_t3_109 . 


783 


• 2703 


156 


4 


71 99 


0.0018 


Protein name. 










Locus Name 




'.. Acc#, 


myosin alpha heavy chain, masticatory muscle




pir:S33732 


S33732 


Description J '■ 
















ORF Name ' ' 


NTID 


AAID 


NT 
Length 


AA 

_ — , Score 
Length 


-c 

Probability 


25212750_c2_205 . 


784 


■ 2704 


329 | 


990 . 285 


7.8e-25


Protein name 










Locus Name 




Acc# 












gp:ATAC006436 


AC006436 



Description ■ 



Arabidopsis thaliana chromosome II BAC F13J11 genomic sequence, complete sequence.






ORF Name 



NTID AAID 



NT AA 

t ~4_i„ x Score 
Length Length — — r^— 



26578557 cl 164 



T7W 



F5~~ 



Probability 
0 . 00042 ~ 



Protein name 



Locus Name 



hypothetical protein 29.1 



pir : S59084 



Acc# 
S59084 



Description 



ORF Name 


NTID " AAID 


NT 
Length 


AA , 
r — ; Score 
Length 


Probability 


268i3135_c2_192 


785 2705 


524 


1575 |1287 | 


3.7e-131


Protein name 








Locus Name • 




Acc# 


alkyl hydroperoxide reductase, F52A protein




pir:D64794


D64794 


Description 














ORF. Name 


■ NTID AAID. 


NT 
Length 


.AA 

„ — . , Score 
Length' 


Probability - 


273427_c3_2;63 


|787 2707 


224 


r 


7 5 583 


1.5e-56


Protein name • 








Locus Name 




Acc# 














P09548 


Description 














DEDA PROTEIN (DSG-1 PROTEIN)


ORF Name ; 


NTID AAID 


NT 
Length 


AA Scor e 
Length ■ 


Probability 


29337825_t-2_62 , - 


|788 2708 


57 / 


2 


04 |131 | 


1.2e-08


Protein, name 








Locus Name 




. Acc# : ; 

P24560 










sp:YPT1_PSEAE





Description 



HYPOTHETICAL 17.0 KD PROTEIN IN PILT 5' REGION (ORF1)


ORF Name 


NT AA 

NTID -AAID — , — , Score Probability 
Length LenyLh ■ ■ - ' 


31252514_c3_23i : 


78.9 2709 . 68 


207 | 85' 0.0054 



Protein name 



Locus Name 



glutathione synthetase 



gp :D88540 



acc# ; 

D88540 



Description , 
Synechococcus sp. DNA for glutathione synthetase, complete cds.






ORF Name 



NTID . AAID 



NT 



AA 



3236b0b c'A 217 



r/9 0 



2710 



T . — . . , Score Probability 
Length Length - •• ■■ — ■ ■ f 

119 



1.7e-07 



Protein name 



Locus Name 



hypothetical protein sll1830



pir:S75232 



Acc# ' 
S75232 



Description 



ORF Name 



NTID AAID 



32593750 EI 27 



2711 



NT AA 
Length Length 
365 I 111 01 



Score Probability 
11382 



3 . le-141 



Protein name 

Description 
RECA PROTEIN



Locus Name 



Acc# 
P42438



ORF Name . 



NTID 



AAID 



33789285 c2 188 



TTTT 



'NT ■ AA 
Length Length 

493 I 11482 



Score 



11448 



Probability 
13 .2e-148 - 



Protein name 



Locus Name 



UDP-N-acetylmuramate:L-alanine ligase MurC



gp:AF110740



Acc# 
AF110740 



Description 



Pseudomonas aeruginosa UDP-N-acetylmuramate:L-alanine ligase MurC (murC) gene, complete cds.


ORF Name • . 


NT AA 

NTID AAID . . „ — . , . — . , Score Probability 
• Lenyth Length - — ■ - — — — r- 


34040777_cl_169 , 


793 2713 196 


591 93 . . 0 . 00011 • 



Protein name 



Description 



. Locus Name 



sp:PPDD_ECOLI



Acc# 
P36647 



PREPILIN PEPTIDASE DEPENDENT PROTEIN D PRECURSOR






ORF Name 


v NT AA 
NTID. AAID — ; — 

Length. Length 


Score 


Probability 


34159412 tl_53 


1794 • 2714 330 993 ■ 


I 497 1 


|l.ye-47 



Protein name 



.Locus Name 



oxidative stress transcriptional regulator 



gp:XCU94336



Acc# 
U94336 



Description 



Xanthomonas campestris alkyl hydroperoxide reductase subunits C (ahpC) and F (ahpF) and oxidative stress transcriptional regulator (oxyR) genes, complete cds.






ORF Name 



135276891-13 132 



Protein name 



NT ID AAID 



NT - AA 

t — \, t — Score - 
Length Length — - - 



2715 



ITT 



^5" 



Probability 
10.00042 " 



Locus Name 



hypothetical protein



pir:D75542



Acc# . 
D75542 



Description 
ORF Name 



NT ID AAID 



3907311 .13 110 



NT AA 
Length Length 
74 



Score Probability 



TIT 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT


ORF Name 


NT ID 


AAID, 


NT 
Length 


AA . 
,., Length 


Score" 


I Probability 












3928750_c3_242 ' 


797-' 


| 2717 


298 


|897 | 


. 412 . 


1.9e-38 | 



Protein name 



Description 



Locus Name 



sp: YH1R_HAEIN 



Acc# 
P31777 



HYPOTHETICAL PROTEIN HI0441 (ORFJ)








' ORF Name 


NTID AAID 


NT", • 
Length 


aa • 

— , , . Score 
Length 


Probability . 


3940943 12 59. 


798 


2 718 .. 


118 ... 


357- ; . |93 | 


(0.00012 ■ 


Protein, name 








• Locus Name 


• * ' Acc# V 



Description 



sp : YGFK_ECOLI 



P45580 



HYPOTHETICAL 12.6 KD PROTEIN IN PEPP-SSR INTERGENIC REGION (O109)


ORF Name . . 


NTID AAID 


NT AA 
— r" — Score 
Length- Length 


Probability 


3947318_13_116 


799 |2719 


276 « 831 11038 


| f |8.9e-105 


Protein name 




Locus Name 


\ .. Acc# 



sp:DAPD_MYCBO 



P56220 



Description 

(THP SUCCINYLTRANSFERASE) (TETRAHYDROPICOLINATE SUCCINYLASE)



ORF Name 



NT ID 



AAID



40I7832_r'2_85 


800 


2720 | 









NT AA 
Length Length 

— 



— - , Score 



5T5~ 



Probability 
|2.0e-S0, 



Protein name 



Locus Name 



DedA family protein



Description 
ORF Name 



4023342 cl 186 



Protein name 



Description 



pir:B75253



ACC# 
B75253 



NT ID 



AAID 



2721 



NT . AA : 

Length Length 
1215 



Score 



Probability 
|3.8e-10 " 



Locus Name 



|sp:'YGF& E<J0Li 



Acc# 
P25533 



ORF Name 


NTID- 


^AAID 


NT 
Length 


• AA 
— Score 
Length 


Probability 


402336_t2_89 


802^ ■ 


2722 


. 83 


252 




Protein name • 








Locus Name 


Acc# 


Description 












NO-HIT , ' . " ' " ; 


ORF Name 


NTID 


.AAID : 


NT 

■Length 


AA 

— Score 
Length . 


Probability 


4140443 t2 _ // i 


803 


2723. 


157. 


1 474. 




Protein name -.. 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name' 


NTID 


AAID 


""' NT 
Length 


AA 

— , Score 
Length 


*' Probability 


4331430J:1__28 


|804 


2724/ 


313- . 


|942 | 124 




•Protein name 








.. Locus. Name 


Acc# 



sp:RECX_VlUUl 



Q56647 



Description 
REGULATORY PROTEIN RECX






ORF Name 


NTID 


AAID 


JM 1 

Length 


AA o ... 
T — L1 Score 
Length 


Probability 


■4332837_c3i25$. 


80S 


2725 ■ 


105 


318 / 143 


6.7e-10


Protein name 










Locus Name 




Acc# 


hypothetical protein sll1830


pir:S75232 




S75232 


Description
















ORF Name


NTID . 


AAID 


NT 
Length. 


AA 

, - — , Score 
Length • ■ - 


Probability 


4348813_c3_259' 


806. 


2726 


350 


1053 |1035 | 


1.8e-104


Protein name ; 










Locus Name 




Acc# 












sp:GCP_HAEIN 


P43764 


Description 
















(GLYCOPROTEASE)
















ORF Name


NTID 


AAID ■■■ 


NT ■ 
Length 


AA 

„ - — Score 
Length 


Probability 


4694427 J:3_U4 


807 


2727 . 


252 | 


1089 |884 | 


1.9e-88



Protein name 



Locus ..Name 



sp:LIPA_HAEIN



Acc#
P44463



Description . • - .-'■<,. .' v 

LIPOIC ACID SYNTHETASE (LIP-SYN) (LIPOATE SYNTHASE) 



ORF Name 


NTID 


AAID 


NT • 
Length 


: - AA ■ - ■ , 

j — . Score 
Length 


' Probability 


47732S0_±2_79 


808 


2 728 


79 . | 


|240. 




Protein name 








Locus Name 


Acc# ,, 


Description 












NO-HIT : ••»■ • ........ 


ORF Name 


NTID 


AAID 


NT 
Length 


AA ■ ' 
— : • Score 
Length • • 


;. Probability 


4798536_c2_l^l . 


, soy 


2729 


212 | 


639 [685 | 


|2,3e-67 | 



Protein name 



Locus Name 



alkyl hydroperoxide reductase subunit C



:AF129406 



Acc# 
AF129406 



Description 



Bacteroides fragilis alkyl hydroperoxide reductase subunit C (ahpC) and alkyl hydroperoxide reductase subunit F (ahpF) genes, complete cds.






ORF Name 



NT ID AAID 



NT - AA 
t " t -Score. 
Length Length 



Probability 



4824062_c2_200 810 2730 " 


273 




b22 . 353 


3.4e-32


Protein name ' 






Locus Name 




Acc# 








sp:PSS_HELPY




Q4826? :O07 
681 


Description 










(PHOSPHATIDYLSERINE SYNTHASE)


ORF Name •'' NT ID AAID 


NT 

Length 


AA 

- — . Score 
Length 


Probability 


6ii0943_c2_203 811 2731 


435 




1308 |303 | 


3.5e-26


Protein name 






Locus Name 




Acc# 








sp:FTSA_BUCAP




051928 


Description ' 












CELL DIVISION PROTEIN FTSA


ORF Name NTID .AAID 


NT 
Length 


AA 

„ — . , Score 
Length 


Probability 


659686 c2 210 |812 2 73 2. • 


278. | 




837 | |485 ] 


3.5e-46


Protein name 






Locus Name 




Acc# 








sp:YGDL_HAEIN




Q57097.:O05 
009< 


Description- 










HYPOTHETICAL PROTEIN HI0118


ORF Name NTID. AAID 


NT ' 
Length 


AA ' '"' 
- , — . Score 
Length 


Probability 


6754081_c3_235 813 2733 


254 | 




765 ■ |169 


|l.le 


-12 



Protein name 



Locus Name 



hypothetical protein MTH939 



pir:G69225



ACC# 

G69225 



Description 
ORF Name 



NTID AAID 



NT AA 

— , — ■ . Score 
Length ; Length ; 



^814^3 13 127 



PT3T - 



21T5" 



F1 [ 



Probability 
0.038 ~ 



Protein name 



Locus • Name 



sp:YXEH_BACSU 



ACC# 
P54947 



Description 

HYPOTHETICAL 30.2 KD PROTEIN IN IDH-DEOR INTERGENIC REGION






ORF Name 



NTID 



682641 11 42 



NT : AA 

AAID — ' . Score ^ Probability 

— Length - Length - — - — - . : *r 





T7T5" 



2TTT~ 



[TuTT 



] [ 



|2.2e-05 



Protein name 



Locus' Name 



hypothetical protein PH0217



pir:G71244



Acc# 
G71244 



Description 
ORF Name 



NTID ' AAID ' 



7222187 tl 35 



2736 



NT . AA , 

Length Length 
'245 



Score Probability 
1287 I |3.4e-25 ~~ 



Protein name 



Locus Name 



conserved hypothetical protein ykrA 



pir : C69862 



Acc# 
C69862 



Description 



ORF Name 



NTID AAID 



781563 c2 227 



2737. 



NT ■ AA 
Length Length 
113 ' 



Score Probability 



7.3e-23 



Protein ■ name 



Locus Name 



sp:YPT6_PSEAE



Acc# 
P24564 



Description 

HYPOTHETICAL 19.5 KD PROTEIN IN PILT REGION (ORF6)



ORF. Name 



NTID 



AAID 



812535 tl 43 



NT ' AA 
Length Length :/ 
77 ■ 



Score ' Probability 



TIT 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT










•■ y _. 


ORF Name 


NTID 


AAID ■ 


NT 
, Length 


AA 

^— , Score 
Length 


Probability ' 


10625252_rl_3 


819 


| 2739 


581 


|1746 |1790 | 


|i.8e-184< 


Protein name 








Locus' Name . 


acc# - 










sp:SYP_HAEIN 


P43830 


Description 








1 ' " r 








ORF Name 



NT ID 



AAID 



NT 



AA 



Length Length 



Score Probability 



2U4y6Ub2 CZ JO 


820 


2740 


AAA 


1ZZ 1 Lz>ZZ 4. be 


- 1 p b 


Protein name 








Locus Name 


ACC# ' 










sp:TRPB_ACICA


P16706 


Description '» 












TRYPTOPHAN SYNTHASE BETA CHAIN








., ORF Name , 


NTID 


AAID 


NT 
. Length 


AA 

L — Score ■ Probability 


2284'7180_cij\L7 


821 


12741 




642 |568 | |5.7e- 


-55 


Protein name 








Locus Name . 


ACC# 



sp:YADG_ECOLI



P36879 



Description ' * 
HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YADG



ORF Name 



NTID 



AAID 



24642268 cV37' 



WIT 



12742 



NT 
n 



AA 

— ~ '" Score 
Length Length — 



Probability
7.9e-49



Protein name 



Locus Name 



sp:TRPF_ACICA



Acc# 
P16923 



Description , 
N-(5'-PHOSPHORIBOSYL)ANTHRANILATE ISOMERASE (PRAI)



ORF Name 



NTID 



AAID 



24814125 c 2,3i. 



WIT 



] Em 



NT . AA 
- Length Length 
1285- " 



— ' 1 Score 



351" 



Probability
4.1e-52



Protein name



Locus Name 



tryptophan synthase alpha chain 



gp:AF107094. 



, Acc# . 
AF107094 



Description 



Rhodobacter sphaeroides thiamine biosynthetic protein (thi) gene, partial cds; and tryptophan synthase alpha chain (trpA) gene, complete cds.



ORF Name 

pJ0727194^C2_24 

Protein name 
Description 
NO-HIT ' 



NTID 



\WI%~ 



AAID 



NT 



AA 



Score Probability 



2744 




62 1 


189 



Locus Name 



Acc# 






ORF Name 



NT ID AAID 



3941642 c'A 2b 



NT . AA 
Length , , Length 
100 



Score 



JUT 



TIT 



Probability , 
1.6e-18 ~ - 



Protein name 



Locus Name; 



sp:YADG_ECOLI



Acc# 
P36879 



Description ... ,' 

HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YADG



ORF Name 



NT ID AAID 



4181557 cl 19 



NT AA 
Length / Length 
281 



■ Score 



ED" 



Probability 
2 . 2e-53 : ~ 



Protein name 



Locus Name 



sp:YQCD_ECOLI



Acc# 
, Q46 9i2 0 



Description " 

HYPOTHETICAL 32.5 KD PROTEIN IN SYD-SDAC INTERGENIC REGION



ORF Name 



NTID AAID 



4426338 c2 26 



NT AA,', 
Length Length 

T£u 



Score Probability 



1.3e-73 



Protein name 



Locus. Name. . 



sp:YADH_ECOLI



Description ' ' . . , ' - 

HYPOTHETICAL 28.5 KD PROTEIN IN HPT-PAND INTERGENIC REGION



Acc# 

P36880:P75657



ORF Name 



NTID AAID 



5120412 c3 32 



T7UT 



NT AA 
Length Length 
186 " 



Score 



Probability 
l,Se-15 *, 



Protein name 



Locus Name 



cytochrome c5 



gp:AVU94420 



Acc# 
U94420 



Description 



Azotobacter vinelandii aldehyde dehydrogenase (aldh) gene, partial cds, cytochrome c5 (cycB) gene, complete cds, and xanthine phosphoribosyl transferase-like protein (xrpt) gene, partial cds.


NT • ■ AA 

ORF Name NTID AAID, : — , ..: ' — . 

Length Length 


Score 


Probability 


7031312_c3_33 829,. 2749 66 


201 . ; 


129 


7. be- 08 , 



Protein name



Locus Name 



sp:YADG_ECOLI 



Acc# 
P36879 



Description ■ 
HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YADG






ORF Name 



: NTID . AAID 



11199410 c3 60 



STCT 



NT' . AA 
Length Length 
\i>92 , I • 12079 



Score Probability 



Protein name 



.Locus Name 



lactoferrin binding protein B



gp:AF043131 



ACC# 
AF043131 



Description 



Moraxella catarrhalis strain 4223 lactoferrin binding protein B (lbpB) and lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.


.• ' NT AA 
ORF Name NTID AAID — , , — , Score 

■- :■ . LenyLh Length. 


Probability 


16128933_c2_53 831 :', 2751' 159 480 ,j 


vn | 


9.9e-35 



Protein name 



Locus Name 



apolipoprotein N-acyltransferase



gp:AF038595



Acc# 
AF038595 



Description 



Pseudomonas aeruginosa apolipoprotein N-acyltransferase (cutE) gene, complete cds.


NT AA 

ORF Name NTID AAID. . .— •, • — , Score 
, ; • • - LenyLh LengLh 


Probability 


19704378_c3_64 * 832' 2752 ■ 505 1821 |1227 


|8.3e-125 



Protein name 



unknown 



Description 



Locus Name 



gp:AF043132 



Acc# 
AF043132



Moraxella catarrhalis strain Q8 lactoferrin binding protein B (lbpB) and lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.


ORF Name NTID AAID 




NT AA . 

^— - ■— , Score- Probability 
Length Length — • — • ; — u 


24337826_c3_67 . : |833 ^ 


2.753 


7.18 | 


2157 ■ |278 | |2. 9e-21 



Protein name 



Locus 1 Name 



hypothetical protein K08H10.2a



pir:T23512 



Description 



Acc# ■ 

T23512:T24613






ORF Name 



NTID AAID 



34154812 c2 54 



NT "AA 
Lengtlx Length 
198 



- — ■ Score 



TIT 



Probability 
I:6e-80 — " 



Protein name 



Locus Name 



lactoferrin binding protein B



gp:AF043133



Acc# 
AF043133



Description 



Moraxella catarrhalis strain VH19 lactoferrin binding protein B (lbpB) gene, complete cds.



ORF Name 


NT ID 


AAID 


NT 

Length- 


. AA ' 

, — . , Score 
Length 


Probability . 


35837503J:2_26 


835 


2755 


61 | 




186 




Protein name 










Locus 'Name 


Acc#, 


Description 












NO-HIT


ORF Name . 


-'■ NTID 


AAID 


NT 
Length 


AA 

„■ — «, Score 
Length 


Probability. 


35945257_ci_4& 


, 836 


2755 


67 




204 • 




Protein name 










, . Locus Name 

:i . ( ■ , . j. •••• 


. Acc# - 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT • 
Length 


: AA' n 

. — , Score 
Length 


Probability:'. 


3906263_c2_5b 


- 837 < 


2757 


1003 




3012 ; 5252 


0.0 • . • 



Protein name 



Locus Name 



lactoferrin binding protein A



Description 



m 



gp:AF043131



. Acc# 
AF043131 



Moraxella catarrhalis strain 4223 lactoferrin binding protein B (lbpB) and lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.



ORF Name 



NTID 



AAID 



^945388 13,30 



NT AA 
Length Length 
1413 



Score , Probability 
TMT^ 11215 I 15 . 9e -124 



Protein name* 



Locus Name 



beta-ketoacyl-ACP synthase I 



gp:PAU70470



Acc# 
U70470 



Description ; 



Pseudomonas aeruginosa lemA-type sensor kinase/response regulator homolog gene, partial cds, beta-hydroxy-ACP-dehydrase (fabA) and beta-ketoacyl-ACP synthase I (fabB) genes, complete cds.



ORF Name 



NT ID AAID 



NT ■ AA . . 

t — ^ t ' : Score ' Proba bility 
Length , Length — • — — ■■ — : — = — = — JL 



4005250 c3 68 



•875" 



T7T 



\2 .Ie-55 



Protein name 



Locus Name 



ribosomal protein S12 : streptomycin 
resistance protein 



tpir:A42529 



Description 



ORF Name 



Acc# 

B42939:A42939:H64078



NT ID 



AAID 



NT 



4093757. c3 Ul 



AA 
angt 
1163 5 



■ — ■ , - — , Score Probability 
Length Length — — • — • 

544 . 



[2954 | 



|3.2e-237 



Protein name 



Locus Name 



unknown 



|gp:AF043131 



Acc# 
AF043131 



Description 



Moraxella catarrhalis strain 4223 lactoferrin binding protein B (lbpB) and lactoferrin binding protein A (lbpA) genes, complete cds; and unknown genes.


ORF Name 


NT AA 

NTID AAID : - — , — ■• Score Probability ■ 
LenyLh LenyLh • 


4804632_t2_17 


841 2761 485 1458 572 2, ie-55 



Protein name 



Locus Name 



sp:PABB_SALTY



Acc# 
P12680 



Description 

PARA-AMINOBENZOATE SYNTHASE COMPONENT I (ADC SYNTHASE)



ORF Name 



NTID AAID 



1018 cl 12 



NT Y AA 
Length Length 
22$ 



Score , Probability 



4.7e-12 



Protein name ' 

Description . 
Bacteriophage MB78 ORFs p21, p11.5, p26 & p28.



Locus Name 



gp:BPMB78E>21- 



Acc# 
X87092 



ORF Name 



NTID AAID 



12303577 cl 11" 



1843 



NT AA. 
Length Length 
Tu~5 — 



. Score Probability 



PTE" 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name . , 


, . NT ID 


AAID- 


"NTT ' 

. Length 


7\ 7V 

T Score 
Length 


Probability. 


19557893 C2 17 


844 


2764 


126 


' 381 




Protein name- 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


. AAID 


NT 

Length 


AA 

, — , ■ Score 
Length 


Probability 


200383Q5 1 _c2_i5 / 


, 845 


2765 ; 


■ 75 


228. . ; 




Protein name 








Locus, Name 


. Acc# . 


Description 












NO-HIT


ORF -Name 


"NTID 


AAID 


. NT 
, Length 1 


AA \ „ 
, — . , r Score 


" f -\ 
Probability •' 








Length 




2i34S$5_c.lJL3 


846 


12766 


169 


' l blu 1 \ 




Protein name 








Locus Name 


f Acc# 


Description 












NO-HIT ■ .. ■ ......... .. .... . 


' ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


23470(S25_c3:_19- 


847. 


| 2767 


18 9 


po |287 | 


|5;9e-25 


Protein name 








Locus Name 


\ r ' acc# 



Description " ' 



L10330 



Plasmid RP4 traN gene, complete cds; traO gene, complete cds; klrA gene, complete cds.



ORF Name 



NT ID AAID 



NT . AA 
Length Length 
97 



Score Probability 



Protein name 
Description 
P^TTTT 



Locus Name 



Acc# 







ORF. Name 



31555 c3 20 



Protein name 



coat protein 



Description 



NTID ■ AAID 



12759 



. NT AA 
Length Length 
333 I ■ 11002 



Score 



1^5" 



Probability 
2>0e-17 — — 



Locus Name 



pir:S58142



Acc# ■ 

S58142:T42283



ORF Name 



34391286 c3 21 



Protein name 
Description 
NO-HIT , . 



NTID. AAID 



T77TT 



NT 
an 
flTO" 



AA 

~ — Score 
Length Length — 



Locus Name 



Probability 



Acc# 



ORF Name 



34485637 c3 22 



Protein name 
Description 



NTID AAID 

I 



NT AA • 1 . 

^— ,. , — - \ ■ Score 
Length Length ■ — : — — • 



F5T" 



[2T7T~ 



TT7~ 



Locus Name 



Probability ■ 



acc# 



ORF Name 



3507563 c3 23 



Protein name 
Description 
[NO-HIT : r_ 



NTID AAID 

] 



NT ' 



AA 



2772 ■ 



Length Length 



Score 



Locus Name 



Probability 



Acc# 



ORF Name 



1480476-3 c2 16 



Protein name * 



Description 



NTID 



AAID 



F5T" 



T77T 



NT , AA . 
Length Length 
122 . I 1365 



Score ' Probability 



Locus- Name 



Acc# 






ORF Name 


NTID 


AAID 


nt ; 
Length 


' AA 

T — • , Score ■ 
Length 


Probability 


6928I68_c2_18 - 


| 854 


| 2774 ' 


3 08 "< 


927 


; ■ 


Protein name 










Locus Name 


ACC# : 


Description 














NO-HIT " ; ... , ' " ' 


ORF Name' 


* 

NT ID 


AAID 


NT 
Length 


• AA 

„ — , , ■ Score 
Length 


Probability 


12625177_ri_2 


1 


1 ^ '7-5 


1148 


447 435 1 


7.0e-41 


Protein name 










Locus. Name 


Acc# 












sp:DKSA_ECOLI


| P18274 


Description 














DNAK SUPPRESSOR PROTEIN


/■ ORF Name 


NTID 


AAID 


NT 

. Length 


AA . 
— . Score v. 
Length 


Probability 


1424i635_.c2_80 


'<| 856 


277.6 ■ 


130 


; 353 .-, 




Protein name 










• Locus- Name 


' ■ Acc# • ' 


Description 














NO-HIT 














ORF Name, 


NTID 


AAID "" 


NT 
Length 


AA 

— . Score 
Length 


Probability 1 


14876S91_c2^S2 : 


. 857 


| 2777 


| 25S 


771 . |350 | 


|7i'2e--32 ■ . 


Protein name 










Locus Name 


,Acc# 












sp:FIMC_BACNO


P17419 



Description ' ■' 

POSSIBLE FIMBRIAL ASSEMBLY PROTEIN FIMC (SEROGROUP H1)



ORF Name . 



NTID *-. AAID . 



161-81301 t3 63. 



^5" 



T71T 



NT 
n 



Probability ' 



Protein name ; 

Description 
HYPOTHETICAL 27.3



Locus Name 



sp:YGGH_ECOLI



Acc# ' 
P32049 



KD PROTEIN IN ANSB-MUTY INTERGENIC REGION (E239)






ORF Name 



NTID AAID 



, NT 



AA 



Length Length 



Score 



1689U413 cl 77 



12779." 



TUJT 



TOT" 



Protein name 



Locus Name 



sp:Y712_HAEIN



Probability 
|9.1e-24 

ACC# 
P44836 



Description 

PROBABLE TONB-DEPENDENT RECEPTOR HI0712 PRECURSOR



ORF Name 



NTID AAID 



2780' 



NT AA 
Length Length 
^T" 1 11855 



Score Probability 
11417 I ' 16 . le-145 



Protein name 



Description 



Locus Name 



sp:UVRC_PSEFL 



Acc# 
P32966 



EXCINUCLEASE ABC SUBUNIT C


ORF Name NTID AAID — , 

Length 


• AA „ 
■ , . Score 
Length 


Probability 


22004587_t2_24- 861 2781 / 333 • 


• ■ 1002 436 


5.5e-41


Protein name:.'. ' 




Locus Name 




Acc#






sp:YADB_ECOLI


P27305:P75662


Description • ' 








HYPOTHETICAL 34.9 KD PROTEIN IN PCNB-DKSA INTERGENIC REGION


ORF Name 1 NT-ID AAID : 1 — 

• ■ '-' -, Length 


AA. ■ ■ 

■ Score 
Length ' 


Probability 


2359.7187_t3_6i 862 2782;.. 769 ■ 


2310 ' 539 


3 : Oe 


-100 


Protein name 




Locus Name 




Acc# 






sp : PRIA_RHORU/ 


P05445 


Description ■.■ . 










PRIMOSOMAL PROTEIN N' (REPLICATION FACTOR Y)


ORF Name NTID AAID , '— -, 1 
• - - Length 


AA .'■ . . 
, — , Score 
Length 


Probability ' 


23865681_t3_58 | |863 2783 | [739 


' 2220 |789 | 


2 2e 


-78 . | 


Protein name 




, Locus Name 




• Acc# ' 






sp:SPOT_HAEIN 


P43811 



Description ■; ' 

((PPGPP)ASE) (PENTA-PHOSPHATE GUANOSINE-3'-PYROPHOSPHOHYDROLASE)






ORF Name 


■ NTID 


AAID 


NT 
Length 


AA 

t — Score 
Length 


Probability 


24412817_c2_92 


|854 


J2784 ■ 


84 


255 | 




Protein name > , 




• ■ 




Locus Name 


Acc# 


Description 


--■ 










NO-HIT












ORF Name 


NTID 


AAID 


NT 
.Length 


. AA ; „ 
— - - Score 
Length 


Probability 






2785 


60" \ 


183 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT , : . . 


ORF Name 


NTID 


AAID . 


NT 
Length 


AA 

„ — , Score 
Length . •• 


Probability 


2659458SJL1_I4 . 




| 2785. | 


21B ./;, | 


|648 355 | ' 


2.le-.32 


Protein name 








Locus Name 


: Acc# ■ 










sp:YICO_E0OLI 


— - — ^ P3143 2 :P76 


Description 






• ■* • 




720 ' 


HYPOTHETICAL 22.0 KD PROTEIN IN RPH-GMK INTERGENIC REGION PRECURSOR


ORF Name 


■NTTD 


AAID 


NT. 
Length 


AA „ 
— — Score 
Length 


Probability 


29304827_C2 94 


1 ^ 


2787 


p | 


• 192. | . 285 


9.1e-24 


Protein name - 








Locus Name . 


■ ' ■ ACC# • . 










sp:Y712_HAEIN 


1 P44836 , 


Description 












PROBABLE TONB-DEPENDENT RECEPTOR HI0712 PRECURSOR


ORF Name 


. NTID 


. AAID 


■ NT 
Length 


AA 

. — . Score 
Length 


Probability 


294844l8_c2_95, 


. 858 . 


2788 


566 


1701 | |710 | 


|l.Se-118 .; 


Protein name 








; Locus Name 


Acc# 


methyltransferase








gp:AF060119


AF060119 


Description . 



Pasteurella haemolytica methyltransferase (mod) and restriction endonuclease (res) genes, complete cds.






ORF Name 



NTID 



AAID 



30281911 53 114- 



NT AA 
x x Score 

Length > Length — — 



Protein name 



Description 



Locus Name 



sp:Y712_HAEIN



Probability 
[9.1e-24 

Acc# ' 
P44836 



PROBABLE TONB-DEPENDENT RECEPTOR HI0712 PRECURSOR




ORF Name 


NT AA 
NTID ' AAID r- , — , Score. 

Length LengLh 


Probability 


|34017010_12_3b 


970. 2790 . |307 J924 | |70S | 


|fl.3e-70 | 



Protein name 



Locus Name 



hypotnetxcal protein Jo 2,4" 31 



pir:F65017 



Acc# 
F65017 



Description 
ORF Name 



NTID AAID 



34407905 c2 92 



871 



NT AA 
Length Length 
1450 I 11383 



— , Score 



Probability 
[1801 1 |1.2e-185 



Protein name 



Locus Name 



L-2,4-diaminobutyrate:2-ketoglutarate



|gp:AB001599 



Acc# ^ 
AB001599



Description 



Acinetobacter baumannii DNA for L-2,4-diaminobutyrate:2-ketoglutarate 4-aminotransferase, complete cds.



. i , ORF Name 



.NTID 



AAID 



aa n ■ : 
t_ -i y — «, Score 
Length Length ; — ^ 



34641550 c3 119 



NT. 
nc 



T2F 



Probability 
I.2e-07 " 



Protein name : : . ■ 

Description 
ATTACHMENT INVASION LOCUS PROTEIN PRECURSOR



Locus Name 



sp:AIL_YEREN



Acc# 
P16454 



ORF Name 
3923818 12 34 



NTID 



AAID 



NT AA 
Length Length 



— • — " . Score Probability . 



873 | [2793 ; [ ]ZZ 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 


NT I D 


AAID ' 


NT 
Length 


AA 
Length 


Score 


Probability 


4306455_cl_65 


874 


|2794 


323 




122 


4.9e-07


Protein name 










Locus 


Name 




Acc# 












sp:FMPl_PSEAE 


P17838


Description 




















JT JTvIL V U I\. O \_/-L\. 


^ tr _L J_i J. IN / 


V O J. J. IN 


Pi) 












ORF Name 


; NTID 


AAID 


.. NT 
Length 


AA. 
Length 


Score . 


Probability 
















B7B 


|2795 


| 212 


|, 639 | 


541 


4.1e-52



Protein name 



Description 



Locus Name 



sp :KGUA_ECOLI 



Acc# 
P24234 



GUANYLATE KINASE (GMP KINASE)










ORF . Name 




NTID AAID , 


NT ; 
Length .. 


AA 
Length 


1 Score 


■ Probability . . ; 


4428413_cl 


_7S 


875- 2796 


517 


1554 | 


|2128 | 


|2.Be-220 



Protein name 



Locus Name 



L-2,4-diaminobutyrate decarboxylase



[gp:AC ; ( ! L24DD 



Acc#
D55724 



Description 



Acinetobacter baumannii gene for L-2,4-diaminobutyrate decarboxylase, complete cds.



ORF Name 



NTID AAID 



4538558'. 13 51 



2797, 



NT - AA 
Length , Length 
270 1 



Score Probability ' 



i.6e-59 



.Protein name 



Locus Name , -'■ 



hypothetical protein^ 



Acc#
X74218 



Description 



Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.


• ORF Name ■ ; • •■ " NTID 


AAID 


NT 
Length 


AA'. 

'- — - Score . Probability 
■ Length — — — , 


4570318_13_44 . |B78 


.12798 


228 


' p7 | 448 | 3.0e-42 


Protein name / 






Locus Name Acc# 



P44755



Description 
PHOSPHOGLYCOLATE PHOSPHATASE (PGP)






ORF ' Name 



NTID 



AAID 



4902305 c2 % 



T7W 



NT 

n 



AA 

^ . -i Scor e 
Length Length — ■- 



Probability 
|b.3e-226 ' 



Protein name 



Locus Name 



restriction endonuclease 



Acc# 
AF060119 



Description 



Pasteurella haemolytica methyltransferase (mod) and restriction endonuclease (res) genes, complete cds.



ORF Name 



NTID 



AAID 



4964000 rJ b7 



12800 



NT AA 
Length Length 
95 I 1258 



Score 



[TFT 



Protein name 



Description 



Locus Name 



sp:RPOZ_HAEIN



Probability 
|I.3e-14 ' 

Acc# 
P43740



OMEGA CHAIN) (RNA POLYMERASE OMEGA SUBUNIT)



ORF Name 



NTID AAID 



15079408 =J3 ^0 



NT AA 
Length Length 
Ibl • ' 



Score Probability 



|±.Ie--33 



Protein name 



Locus Name 



hypothetical protein 1 (vnfA 5' region)



pir :B44514 



Acc# . 
B44514 



Description ; 



ORF Name 



■NTID AAID 



5080293 c2 .89 



12802 



NT AA 
Length ■ Length 
592 ' " I 11779 



Score 1 Probability 



11142 



|8.5e-116 



Protein name' 



Description . 



Locus Name 



sp:RECJ_HAEIN



' Acc# 
P45112 



SINGLE-STRANDED-DNA-SPECIFIC EXONUCLEASE RECJ



ORF Name 
|5339762_c3J-09.. 



NTID AAID 1 



NT 
n 

TF5 



AA 

~ ~ Score 
Length Length ■ 

[TiW 



Protein name 



Locus Name 



Probability 

p.0e-i20 ' 

Acc# 



LytB 



AF027189, 



AF027189 



Description 



Acinetobacter sp. BD413 lytB, comB, comC, comE, and comF genes, complete cds; and unknown genes.



ORF Name 


NT ID 


AAID . 


;,nt 

Length 


' AA 
, — . Score 
Length 


Probability 


97b82_13__4^ 


884 


■ 2804 


289 | 


870 291 


1 . 3e 


■ " . 1 


Protein name 










Locus Name 




ACC# 












sp:ICC_ECOLI


P36650 

..j 


Description 
















ICC PROTEIN ' 


ORF Name 


NT ID 


• AAID 


NT: 
Length 


AA 

. — , Score 
Length 


Probability 


1367343 7_c2_39, : 




2805 


p7 


17 91 150 


3 . 5e 


-13 • - 


Protein name 










Locus Name 




Acc# 



AF147978 



AF147978



Description 



Bacteriophage D3 putative terminase, putative portal protein, putative ClpP protease, and major head protein genes, complete cds; and unknown genes.



ORF Name 


NTID 


AAID 


NT 

. Length 


aa' ; 

— — , - Score 
Length; 


Probability 


14181292_cJ_43 


| 885 


2805 ■ 


101 J 


305 i 




Protein name; 








Locus Name. 


" Acc# 


Description 












NO-HIT


ORF Name , ■ 


NTID 


AAID ' 


NT ■ 
Length 


AA . ■ ■ 
— , , Score 
•Length ., • 


Probability 


|19^5312b_c2_3^ 


| 887 


. 2.807 


| 21b 


|648 




Protein name. 


■ 1. 






. Locus Name- • 


Acc#. 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT . 
' \ Length 


AA . 

f — , Score 
Length 


. Probability 


2566595?_ci_33 


IF 9 


. 2808 


: |128 


387 .: 88 


0.00-35 ; J 



Protein name 



Locus Name 



1.7 protein 



gp:BPH2S1805- 



Acc# 
AJ251805 



Description , . 
Bacteriophage phiYeO3-12 complete genome.






• ORF Name . NTID AAID 


NT 
Length 


AA . , 
, — . , . Score 
Length " ■ 


Probability 


2667531i_c2_3'j J |889 | 2809 


|,107 


|324 14b | 


3.8e-lp | 


Protein name 1 




Locus Name 


Acc# . 


hypothetical protein 




gp:XNE133022


AJ133022 


Description ' " ' 


Xenorhabdus nematophilus proviral ORF1 to ORF8.


ORF Name' NTID AAID 


... NT 
Length 


AA .. • 
— - , Score 
Length 


Probability 




150 | 


453 1209 1 


|6 . 7e-18 | 


Protein name ■ ■ 




Locus Name 


" Acc# 


DNA primase 




• pir:C41830 


C41830. 


Description . 








' ORF ^ Name ' '. NTID AAID 


NT 

Length 


AA 

' " Score 
Length 


. i ■ 
Probability 


3941887_cl_32 .851 , 2811 


185 ■ 


558 '. |88 | 


10.019 J 


Protein name 




Locus Name 


Acc# 






gp:PFA53C6


X17490 


Description 








Plasmodium falciparum mRNA for asparagine-rich antigen (clone 53C6).


ORF Name NTID AAID \ 


NT 
Length 


AA 

. — . Score 
Length 


Probability; . 


|591S253_cl_28 892 ... 2812- 


| 257 


[774 | 259 


3.1e-22 ; 


Protein name 




Locus Name, ; 


Acc# 






Sp : Yhl2 2_HAE1N 


P44193. 


Description 








HYPOTHETICAL PROTEIN HI1422


ORF Name " NTID AAID ; 


NT 

Length. 


AA 

— , Score 
Length. 


Probability ■ 


900215 1 _c3_4i ■ | 892 .2813 


797 | 


2394 |330 | < 


|4..7e-27 


Protein name 




Locus' Name 


Acc# 


putative DNA primase




gp:AF139719 


AF139719 



Description 



Klebsiella oxytoca plasmid pACM1, putative DNA primase (pri) gene, complete cds; and unknown genes.




ORF Name 


NT 

NTID AAID - — ^ 
Length 


AA 

— , Score 
Length 


Probability 


|109=>9627_t2_14 


" |894 | 2814 294. J 




i.8e-49 


Protein name 




Locus Name 


Acc# 






sp:YBEX_ECOLI


P77392 


Description 








HYPOTHETICAL 33.3 KD PROTEIN IN CUTE-ASNB INTERGENIC REGION




ORF Name 


NT 

NTID AAID — . , 
Length 


AA , . 
— , Score 
Length 


. Probability 


|I178127_t3_20 


| |895 2815 | 963 


|2892 |2605| 


|1.9e-28i 



Protein name 



Locus Name 



SecA



|gp:AB012226 



Acc# 
AB012226 



Description 

Vibrio alginolyticus gene for SecA, complete cds.



ORF Name 



NTID AAID 



12298468 11 13 



. NT, AA 
Length Length 
102 



Score, 



JUT 



Probability 
10.017 



Protein name 



Locus . Name 



probable membrane protein' L54 9 . 12 



[pir : 



T02800 



' Acc# 
T02800 



Description 



ORF Name 



NTID '. AAID 



12 98 53-93 J: 3_H 



2UT7" 



NT • ... AA 
Length Length 
T7TT 



Score 



"5T7" 



Probability 
l.le.-bl - 



Protein name 



Description 



Locus Name 



sp:PEPD_HAEIN



Acc# . 
P44817 



(PEPTIDASE D)


ORF Name . ' NTID AAID 


NT 
' Length 


' AA 
— : , Score 
Length 


Probability 


14538262 _c2_40, 898 2818 


r 1 


1 |210 |109 | 


|2 . 5e- 


-06 ' 


Protein name ■ 




Locus Name 




Acc# 


hypothetical protein APEU4bb




, |pir :A72741 




A72741 



Description 






ORF Name 


NTID AAID 


"NTT 1 

. Length 


AA 

Length Score 


Probability 


14548260_cl_34 


| |899 | 2319 


244 ■■ 


735 265 | 


7.3e-23 


Protein name. 






Locus Name.. 


Acc# 


hypothetical protein D1022.4




pir:T34190 


T34190 


Description 








. ;• ■ 


ORF Name 


NTID , AAID 


NT 
Length 


AA 

. — Score 
Length 


Probability 


21640900_tl_8 


|900 | |2820 


442 - 


1329 | |1361 | ( 




Protein name 






Locus Name 


• Acc# 








sp:GSA_PSEAE


P48247 


Description , 










(GLUTAMATE-1-SEMIALDEHYDE AMINOTRANSFERASE) (GSA-AT)




ORF Name . 


NTID AAID. 


NT ' 
Length 


AA ' 
T — ■ Score 
Length 


Probability 


24609561_c3_56 


,901. 2821 


" 241 


725 . \ |147 | 


|2.4e-19 


Protein name 






Locus, Name 


Acc# 



Description 



sp:UP14_ECOLI



P39179:Q46826



UNKNOWN PROTEIN FROM 2D-PAGE (SPOT PR51)








ORF Name 1 




-NTID AAID 


.," NT 
Length 


AA 
Length ., 


Score 


Probability 


36329806 ; ^rl_2 . 


902;. 2822 


' 80 . : | 


243 - 


60 


0.019 



Protein name 



Locus Name . 



thyroid hormone sulfotransferase, B2



pir:JC5885



Acc# . 
JC5885 



Description 
ORF Name 



NTID AAID 



4188811 tl 6. 



NT AA • ' ' 

Length Length 
1B9 



S.core Probability 
1310 I IF! 2e-27 . 



Protein name 



Locus Name 



conserved hypothetical protein 



pir:T03501



Acc# 
T03501



■ Description 






ORF Name 


NTID AAID 


NT 
Length 


AA 

— . , Score 
Length 


Probability 


433601U vtiJ7. • 


| 904 2824 


1115 


|348 349 




Protein name' 




1 




Locus Name 


Acc#










sp:PHNA_ECOLI 


P16680


Description . 




J : . \ 








PHNA PROTEIN


ORF Name 


NTID. AAID ' 


' NT 
Length 


AA 

. — , , Score 
Length 


Probability 


4689143_t2_15. 


| 905 -2S25-. 


.,175 


525 J 1132 


p.Oe-08 


Protein name 








Locus Name 


Acc# 


apolipoprotein N-acyltransferase




- 


gp:AF038595


AF038595



Description ' ■ ■ 



Pseudomonas aeruginosa apolipoprotein N-acyltransferase (cutE) gene, complete cds.



ORF Name 


NTID 


AAID . 


NT , 
Length 


x — , i : Score 
Length ,. 


/Probability 


520b312_t3_21 ; 


906 


| |2B2-fi 


69 | 


pio I- 




Protein name 










Locus Name 


'.. Acc# ' 


Description ; 














NO-HIT














ORF Name 


NTID 


AAID ■:. 


NT. 
! Length 


AA 

. — , Score 
Length 


Probability 


22382752__cl_ll . 


1907 


| 2927 


117 


354 |100 | 


p:2e-05 


Protein name . 










Locus Name


ACC# 


hypothetical protein








pir:T10511


T10511 


Description 














ORF Name' 


NTID 


AAID - 


' NT 
Length -. 


AA 

, — ; , - ' Score. 
Length. 


■ Probability 


26251376_C1_8 




. 2928 


1 92 1 


24 9 ] 





Protein name Locus Name Acc#

Description

NO-HIT






ORF Name 



NTID 



AAID 



NT AA 
t "^u.-u t ' — ^ Score 
Length Length — 7 



Probability 



35188942_c3__14 • 9.09 


2829 255 " 


768 |248 | 


|4 :€e 




Protein name 






Locus Name 




' Acc# 


hypothetical protein slriy/l 




pir:S75639 


S75639 . 


Description . .,. • . •- 






- t 






ORF Name NTID 


NT ; 
AAID — 
- Length 


AA 

T — . , Score 
Length 


Probability 


3'5337805_cl_10 910 


2330 185 


558 | |152 | 


|5-9e 


-12 . 


Protein name . 






Locus Name 




Acc# ; 


sulfate transporter


gp:D89631


D89631


Description 












Arabidopsis thaliana mRNA for sulfate transporter, complete cds.




ORF Name-, NTID 


NT 

AAID — ... 
* - . Length 


AA 

— >, Score 
Length . - . 


Probability 


35801416_r'l_l , 911 


,2831 . 255 


7S8 |379 | 


^. le 


" Jb • ^ 


Protein name 






Locus Name 




"Acc# 








sp:RLUA_ECOLI




P39219 


Description '■• 












(PSEUDOURIDYLATE SYNTHASE) (URACIL HYDROLYASE)








ORF Name * ' 1 NTID 


NT- 
MID ' . — ■-. 

-Length ■ 


AA- • ' 
. — . , Score . 
Length 


Probability 

■ ■•■ ■ j- • . • 


4328403_c2_13 . pi 2 - 


28J2 ; 109 


|530 | , 191 


- 5 . le- 


'-I 


Protein name 






Locus ' Name 




Acc# 


BolA protein




gp:PFL243174




AJ243174


Description . .. : 












Pseudomonas fluorescens partial fumarase C gene, bolA gene and ORF1




ORF Name ' . NTID.. 


NT 

AAID 

Length 


AA 

_ — ■ Score 
Length 


Probability 


4572206_cl_9 • | 913 • 


(2833 380, . 


1143 | ; 391 




: | 


Protein name 






Locus Name '. 




ACC# . 


sulfate transporter






gp:AB008782




AB008782 
> 


Description , ' 












Arabidopsis thaliana mRNA for sulfate transporter, complete cds.






ORF Name. 



16990667 11 7 



Protein name 



hemV protein 



Description 



ORF Name 



19534511 cJ 4b 



Protein name 
Description 
NO-HIT



NTID AAID 



2834 



NT AA I 

Length Length 

SI I' 



Score 



Locus Name. 



pir:S54440



Probability 
|8:8e-06 ; 

Acc# 
S54440 



NT ID 7 AAID 



NT AA 
Length Length 
7TT 



Score Probability - 



] D 



Locus Name 



Acc# 



ORF Name 



15573425 ci 51 



Protein name 



Description 



NTID AAID 



^— . — — Score Probability 



T8HT 



NT. AA 
Length Length 
150 ' I . [3~5X 



Locus 'Name 



Acc# 



NO-HIT



ORF Name 



20408375 c2 34 



. Protein .name 
Description 
NO-HIT



NTID 



AAID 



2837 



NT AA 
Length Length 
60 I v ' 1183 



Score 



Locus Name 



Probability 



' Acc# 



ORF Name 



20330 c3 42 



Protein name 



NTID 



AAID 



tut 



2838 



NT AA 
Length Length 
144 - 



Score ' Probability 
W7~ 



uUTTT 



Locus Name 



spiAW^Py^PU 



;Acc# 
P25760 



Description 
ATP SYNTHASE PROTEIN I 






ORF Name 



NTID " .AAID 



20991507 c2 35 



12839 



.'NT ; AA 
Length , Length 
U02 



— - , Score 



Probability 
|8.0e-14 



Protein name 



. Locus Name 



periplasmic zinc transporter ZnuA 



gp:AF141971



Acc# 
AF141971 



Description 



Haemophilus ducreyi HI0318 homolog gene, partial cds; oxidoreductase homolog and periplasmic zinc transporter ZnuA (znuA) genes, complete cds; and ribose-5-phosphate isomerase A homolog gene, partial cds.



ORF Name 



NTID AAID 



22129692 c2 37 



NT, 
Length 
1496 



Length 
11491 . 



Score 



11855 



Probability 
|I.6e-192 



Protein name- 



Locus Name , 



H+-transporting ATP synthase, beta chain



pir:D64071



Acc# 
D64071



Description 



ORF Name 



NTID 



AAID 



24507777 cl 4 J 



WIT 



NT 
■Length 

— n 



v AA 
Length 

11563 



Score 



11979 



Probability 
II . 7e-204 " 



Protein name 



Description 



Locus Name 



sp:ATPA_ECOLI



, Acc# . 
P00822 



ATP SYNTHASE ALPHA CHAIN


ORF Name , NTID AAID 


NT 
Length 


AA 
Length : 


Score 


Probability 


25680186__ci_28 |922 2842 ' 


1 P° • 


4.8:3 


371 


4 . ie-34 | 



Protein name 



Description 



Locus Name 



sp:ATPF_VIBAL 



Acc# 
P12989 



ATP SYNTHASE B CHAIN












' ORF Name " 


NTID 


AAID 


NT . , 
Length ! 


AA 

— -■ _ . Score 
Length . 


Probability ■ 


34023378_ci_2^ 


923 


2843 


;-29f 


891 ' .745 


' 9 . 9e-74 


Protein name 








Locus Name 


Acc# 



sp:ATP6_ECOLI



Description . • * 

ATP SYNTHASE A CHAIN (PROTEIN 6)



P00855:Q47708



ORF Name NTID 


AAID 


NT 
Length 


AA 

„ —,\ Score 
Length 


Probability 


3BI59406_c3_39 ■ - , 924 


" 2844 


65 


198 |65 | 


|0 . 004b 


Protein name 








Locus Name 




Acc# 


extensin homolog F2401.18




pir:T01456




T01456 


Description: 














ORF Name NTID 


AAID* 


NT 
Length 


AA 

— - , Score 
Length 


Probability 


3953587 t'i 23 925 


2845 


174 ' 


525 236 | 


8T6e- 


-20 . 


Protein name 








Locus Name 




Acc# 










sp:ZUR_ECOLI




P32692:P76784


Description 












ZINC UPTAKE REGULATION PROTEIN (ZINC UPTAKE REGULATOR)






ORF Name , ' . ' . NTID 


, AAID 


NT 
Length 


AA 

, - — . , Score 
Length 


Probability 


4i20425_ci_29 926 ■ 


2846 


202 


|6Q9 | 268 | 


3 . 5e 


-23 


Protein' name . 








Locus Name 




Acc# 










sp:ATPD_VIBAL




P12987


Description 














ATP SYNTHASE DELTA CHAIN


ORF Name NTID 


AAID 


NT : 
Length 


. AA 

— ,• Score- 
Length- 


Probability 


4332943 jcl_27 . 927 


28.47 ' 


84 - 


255. • 261 


1. 9e 


t-22 


Protein name ' 








'Locus Name 




Acc# 










sp:ATPL_HAEIN 




P43721


Description ' •' ■ 














(DICYCLOHEXYLCARBODIIMIDE-BINDING PROTEIN)


ORF Name ' f NTID 


AAID 


NT 
Length 


AA r, 

, Score 

Length - 


Probability 


56333_c3_44 . • 928 


2848 


309 


1930 1 894 


1.6e 


-89 •• 


Protein name - 








Locus Name 




Acc# 










sp:ATPG_ECOLI




P00837:P00838


Description 













ATP SYNTHASE GAMMA CHAIN






ORF Name 


NT ID AAID 


NT 
Length 


AA 
Length 


Probability 


7056441_c2_38 


|929 | 2849 


139 | 


|417 pOO | 


|1 . 4e 


-26 


Protein name 






Locus Name 




Acc# 








sp:ATPE_HAEIN




P43718 


Description 












ATP SYNTHASE EPSILON CHAIN


ORF Name 


NTID AAID 


NT 
Length 


AA 

, — . , Score 
Length 


Probability 


9928433_t2_17 


| |930 2850 


71 


216 | 






Protein name 






Locus Name 




Acc# 


Description 












NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 
Length 


Probability 


32359bu_c2_28 ( 


| 931 • 28,51 


561 ; 


1686 |2096 | 


|6.8e 


^217 


Protein name 






Locus Name 




Acc# 


urocanase


gp:PSEHUTUO




Description . 


M33923:M28362


Pseudomonas putida urocanase (hutU) gene, complete cds.






ORF Name ; , 


NTID . AAID 


NT ... 
Length 


AA ' 

- — Score ' 
Length 


Probability 


3940938_c2_30 . 


932 2852 


357, ,\ 


1074. 266 , 


1 ;■ 4e 


-33 


Protein ■ name ..*'•". 






Locus Name., 




Acc# 








sp:HUTG_KLEAE




P19452 


Description • 












(HISTIDINE UTILIZATION PROTEIN G) (FRAGMENT)








ORF Name 


. NTID AAID 

■ V 


NT 
Length > 


AA 
Length. 


Probability 


39S3-I8I_c3_34- ; 


|933 . 2853 


434 


,1305 |1007 | 


|1 . 7e- 


-101 • 


Protein name 






Locus Name 




Acc# 
AL031866 








gp:YP102KB





Description 

Yersinia pestis 102 kbases unstable region: from 1 to 119443.



ORF Name ' 


NTID 


AAID 


NT 
Length 


AA 

T — L1 Score 
Length 


Probability 


482218i_t3_14 


934 


2854 | 


78 


237 






Protein name 










Locus Name 




Acc# 


Description 
















NO-HIT 








NT 
Length . 


AA ' 

T — . , Score ■ 
Length 


Probability 


789037_c3_33 


935 . 


. 2855 


525 | 


|1578 |1486 | 


|3 . Oe 


-152 


Protein name 










Locus Name 




Acc# • 


histidine ammonia-lyase, histidase






pir:A35251


A35251:S39381


Description 














■ ORF Name 


NTID 


AAID 


NT 
Length 


AA 

„ — , , . "Score 
Length 


Probability 


|828942_tl_i 


93.6 


. 2855 . 


. 295 


8 


91 ■ |180 | 


|7.0e- 


-12 


Protein name 










Locus Name 




Acc# 



. Description 



sp:YYAM_BACSU



P37511 



HYPOTHETICAL 32.9 KD PROTEIN IN TETB-EXOA INTERGENIC REGION




ORF Name 


NTID .. AAID 


NT 
Length 


AA 

— Score ... 
Length ■ 


Probability 


10978400_c3_163 


937 • 2857 


253 


|762 , 185 | 


• 1 . 7e-14 


Protein name 






Locus Name 


" - Acc# -< 1 ■ 








sp:HEM4_PSEAE


P48246


Description 












ORF Name 


NTID AAID 


NT 
Length 


AA j 
— x , Score . 
Length 


. Probability. 


I1259526_c2_127 


,938 28.58 


|614 


[18.45 | |175S | 


j7.!ie-18i | 



Protein ,name 



Locus Name 



Acc# 



sp:YA51_HAEIN



Description '. . V| 

HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN HI1051



Q57180:O05043






ORF Name 


, NT ID AAID 


• NT 
Length 


— ' Score 
Length 


Probability 


I1267230fi^c2_i40 


||939 2859 


1 110 . 1 


[333 |144 


4.8e-10; 


Protein name 








. Locus Name 


Acc# . 










sp:YQSX_HAEIN 


P44048 


Description 








.... , < ■ . , , 




HYPOTHETICAL PROTEIN HI0760


. ORF Name . 


. NT ID : AAID. - 


NT 
Length 


AA • 

— . ■ Score 
Length 


Probability 


13131890 ±3 7i 


940 | |2860 


169 


p 


10 | 147 


2.7e-10 • . | 


Protein name 








Locus' Name 


Acc# 










sp:DSBC_ERWCH


P39691 


Description 












THIOL:DISULFIDE INTERCHANGE PROTEIN DSBC PRECURSOR


ORF Name 


NT ID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


1484729.0_r3_89 


941 . 2851 


61 


. 186 . 




Protein name 








Locus Name "' 


Acc# ■ 


Description 












NO-HIT


ORF Name 


NTID AAID 


NT 

Length 


AA 

, — . , Score . 
Length 


Probability " 


15875832_t3_92 : 


942 | 2862 


7b . 


228;- 




Protein name '"- 








Locus Name 


1 . . Acc# ." 



Description 
NO-HIT



ORF Name 



16493891 c3 16b 



Protein name 



Description . 



NTID 



AAID 



NT AA ■ ..' " 
— * , — . . Score 
Length Length . 



eh. 



Locus . Name 



Probability 



Acc#, 



NO-HIT






ORF Name 


NT ID 


AAID ■ 


"■ NT ' ' 
Length 


AA 

T -—^ .Score 
Length 


Probability . 






| 28.54 " 


193 


292 




Protein name 








Locus Name 


* . Acc# 


Description 






* 






NO-HIT


ORF Name 


NTID 


AAID 


■ NT 
Length 


i AA • ' 
— , , Score 
Length 


Probability 


i95770I_ti_30 


|945 




2.77 | 


|834 | 329 | 


,i.2e-29, 


Protein name 








Locus Name 


Acc# 



Description 



sp:GRPE_HAEIN



P43732 



GRPE PROTEIN


ORF Name ... 


NTID 


AAID 1 


NT 
Length 


AA 
Length 


Score 


Probability 


19720930J:3_74' 


346' 


. 


276 


831 


59,7 . 


4.8e-58 



Protein name 

Description . ' 

DIHYDRODIPICOLINATE REDUCTASE



Locus Name 



sp:DAPB_ECOLI



\ Acc# 
P04036



ORF Name 



21489390 r2 59 



NTID ' ;". AAID 
947 . 



NT. . AA 

- — — . Score 

Length Length — ^ — 



28S7 



Protein name 



Description 



Locus Name 



\ Probability 



Acc# 



NO-HIT


■ ORF Name 


NTID : 


AAID ■ 


• NT 
Length 


: AA 
Length 


Score.. 


Probability' 


21689063J:i-_i0 ' 


< 948 . 


2868' 


414 | 


1245 


. 328 - 


1.5e-29 • 



Protein name 



Locus Name 



SrpJ 



gp:AF176824



Acc# 
AF176824 



Description 



Synechococcus PCC7942 plasmid pANL O-acetylserine (thiol)-lyase SrpD (srpD), gamma-glutamyltranspeptidase SrpE (srpE), alpha-helical coiled-coil protein SrpF (srpF), SrpJ (srpJ), ATP-binding protein of ABC transporter SrpK (srpK), membrane lipoprotein SrpL (srpL), and cytoplasmic membrane protein SrpM (srpM) genes, complete cds.



ORF Name 



NTID 



AAID 



22042177 t'l 6b 



Protein name 



argininosuccinate lyase argH



Description 



AA 

Score 

; Length Length ■ ' ^ 



NT 



TT8~2~ 



Locus Name 



pir:C69589



Probability 
|2.9e-i22 — ^~ 

V ACC# 
C69589 



ORF Name 



. NTID 



AAID 



2207Uiyitl 5 



2 8 7 0 



NT 
n 



' AA „ 
™ — - Score 
Length Length v — 



TZTTT 



Probability 
|485 | p . ae-45 ~~ 



Protein name 



Locus Name 



cystathionine-gamma-lyase



|gpT 



AF180145 



Acc# 
AF180145 



Description 



Zymomonas mobilis GTP-binding protein CgpA (cgpA), 60 kD inner-membrane protein yidC (yidC), hypothetical protein, glutamine-pyruvate aminotransferase gltB (gltB), glutamate synthase small subunit gltS (gltS), undecaprenol kinase udk (udk), hypothetical protein, NADH dehydrogenase, hypothetical protein, zm12orf5, hypothetical protein, aspartate aminotransferase A, beta-hydroxysteroid dehydrogenase, phosphomannomutase pmm.



ORF Name 



22922082 c2 1^8 



Protein name 
Description 



NO-HIT



NTID .. AAID 



2571 



. NT ■ AA 

. Length . Length 
519 



Score 



Locus Name 



Probability 



Acc# 



ORF Name 



234701 c3 147 



/Protein name 



extensin



Description 



NTID 



AAID " 



,;NT AA 
Length Length ' 



Score 



Probability 
|5.'le-07- ' 



•Locus Name 



pir:S22697



, Acc# 

S22697:S21006



' ORF Name 



23572127 rl 31 



Protein name 



NTID 



AAID. 



NT AA 
— , — - , Score 
Length Length — 



Probability 



2873 | p& | [1911 | [2307 | |3.0e^239 ~~ 

Locus Name - Acc# 



,sp : t>NAK_t H RATU 



P48205 



Description ., 
DNAK PROTEIN (HEAT SHOCK PROTEIN 70) (HSP70)



ORF Name 



12390969a fl 77 



NT ID 



AAID 



2874 



' NT ,. AA 

Length . Length 
F^S 1 11401 



Score Probability 



4 . 2e-43 



Protein name 



Locus Name 



rubredoxin--NAD+ reductase, hypothetical protein hydA 3'-region



. Acc# 
' C65051 



Description 
ORF Name 



NT ID AAID 



NT AA 

— , — / Score Probability 
Length Length — • '— 



24073762 c3 .lbU 



|5.0e-67 



Protein name 



Locus Name 



AvtA 



gp:AF014804



Acc# 
AF014804 



Description 



Neisseria meningitidis PglB (pglB), PglC (pglC), PglD (pglD), and AvtA (avtA) genes, complete cds.



ORF Name 



NTID AAID 



NT ' AA 
'.. — - ^ — , Score Probability 
Length Length — ■ « 



124100465 t2 46 



TIT 



1 .3e-64' 



Protein name 



Locus Name 



intrinsic membrane protein 



gp:AB000100



Acc# . 
AB000100 



Description 



Synechococcus sp. DNA for intrinsic membrane protein, malK-like protein, cyanase, complete cds.



ORF Name 



24391877- ci 109 



NTID 



AAID 



NT AA ■ 

Length Length 
329 



Score 



Probability 
5.1e-93 r 



Protein- name 



Description 



Locus Name 



sp:HEM3_ECOLI



Acc# 

P06983:P78125





SYNTHASE) (HMBS) (PRE-UROPORPHYRINOGEN SYNTHASE)


NT 

ORF Name NTID AAID 

Length 


AA 

. Score 
Length 


Probability 


244I7807_r3_76 | 958 | 2878 | |185 




|1.2e-22 



Protein name 



Locus Name 



Mip 



pir:S71704



Acc# 
S71704 



Description 






ORF Name 



■NTID AAID 



24619003 79 



2ST9~ 



NT AA 
Length Length 
267 



— Score 



Probability 
5.2e-61- " 



Protein name . . ■ . ■ 

Description 
NITRATE TRANSPORT ATP-BINDING PROTEIN NRTC



Locus Name 



sp:NRTC_SYNY3 



Acc#
P73450 



ORF Name 



126053250 cl.,112 



Protein name 
Description 



NTID AAID' 



2580 



NT AA 
Length Length 

1268 i mr 



. — Score probability 



Locus. Name . 



Acc# 



ORF Name 



NTID 



AAID 



26053250 c2 .144 



NT AA 
Length Length 

116 



Score Probability 



Protein name. 



Description 



Locus ..Name 



' Acc# 



NO-HIT


" . ■* NT 
ORF Name ' NTID AAID „ — . - 
• • , ; , Length 


AA ! - 

' — ■ ' Score 
Length 


Probability 


26362680_c2_129 " 962 • , 2882 ■• 296 


891 1480 1 


|1.2e-45 


Protein name ■ 




Locus Name 


Acc#






sp:YJFH_HAEIN


P44906 


Description 








HYPOTHETICAL TRNA/RRNA METHYLTRANSFERASE HI0860


. r , • ■ ' NT 
ORF Name NTID AAID - — , , 

Length 


AA 

, Score. 
Length 


Probability < 


2928382^2^40-; |963 2883 323-. | 


|972 8 95 


|1.3e-89 


Protein name 




Locus Name 


•Acc# 


sodium-dependent transporter homolog yocS




pir:E69902 


E69902 



Description 






ORF Name 



NTID 



AAID 



29339432 c3 1U 



NT , AA 
Length Length 
322 



Score 



Protein name 



Locus Name 



hypothetical protein b2755



pir:G65056



Probability 
|4.2e-06 " 

Acc# 
G65056 



Description 
ORF Name 



NTID 



AAID 



NT 



AA 



29962837 c3 181 



12885. 



t^™^ to^i-h Score Probability 
Lengcn Lengtn • -• • : : *r 

288 



TTT 



9.9e-I3 



Protein name 



Description 



'Locus Name 



sp:DNAJ_SYNP7 



Acc#, 
P50026 



DNAJ PROTEIN 



ORF Name 



NTID . AAID 



32629186 c2' 139 



. NT AA 
Length Length 
223 



Score Probability 



TTTT 



Protein name 



Description



Locus. Name 



Acc# 



NO-HIT



ORF , Name 



NTID 



AAID 



3307 12 47- 



' NT . AA 
Length Length 
119 



T6TT 



Score : Probability 
, WET 



9..1e-32 



Protein name 

Description ' 
HYPOTHETICAL PROTEIN HI1723



Locus Name 



Acc# 



spiYADRJKAEHfl f. 1 P45344. 



ORF Name 



134492161 c3 168 . 



NTID ■ AAID 

] 



^6"S~ 



NT." AA 
Length Length 
353 I 11062 



Score , Probability 



Protein name '■ 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID AAID 



394092b c3 1U2 



NT 
n 



AA 

T ' — Score 
Length : Length ~ 



Probability 
|3.7e-44 



Protein name 



Locus Name 



Acc# 
P45524 



Description , •' . ; •• • 

HYPOTHETICAL 38.5 KD PROTEIN IN KEFB-PRKB INTERGENIC REGION



ORF Name 1 


NTID , 


AAID 


NT 
Length 


AA 

— Score ... 
Length 


Probability 






IZ, O 2? VJ 


lOl 


(485 | |147 | 


|2.3e 


-10 ... -| 


Protein name 










. Locus Name 




Acc# 


hypothetical protein Rv0163








pir:G70903




G70903


Description 
















. ORF Name 


NTID 


AAID 


NT • 
Length 


AA 

— ■ Score 
Length ' . • 


Probability . 


4187538_t2_50 


971 | 


2891 


473 . i 


1422 1141 


1 . le 


-115 


Protein name 










Locus Name 




Acc# 












sp:MPL_HAEIN 


P43948


Description' 
















LIGASE
















ORF Name 


NTID, 


AAID " 


■ . . NT 1 
•Length 


AA . . \ ' 
• — Score -. 
Length . 


Probability " 


4328135_rl 13. 


1 p2 | 


2892 


474 


1425 . 1 BI2- 


7.9e.-=81 . • > 



Protein name 



Locus Name 



periplasmic substrate binding protein



gp:AF001333



■ Acc# • 
AF001333 



Description 



Synechococcus PCC7942 periplasmic substrate binding protein (cynA), integral membrane protein (cynB) and ATP-binding protein (cynD) genes, complete cds.



ORF Name 



NTID . AAID 



4328955. 13 99 



%• NT 
L.en 
TT2 



— "score 
Length Length - . 



Probability 
1.2e-24 . 



Protein name 

Description 
HYPOTHETICAL -PROTEIN HYP0117 



1 Locus Name 



sp:Y117_HAEDU



Acc#, 
O30825






ORF Name 



NT ID AAID 



4423318 c3 160 



J 



12894 



NT AA 
Length Length 
1526 | 11891 



Score 
12307. 



Probability 
|3.0e-239 — 1 



Protein name- 



Locus Name 



93% identity over 631 amino acids with E. coli



gp:STYSTMFl : 



Acc# ' . 
AF170176 



Description 



Salmonella typhimurium fragment STMF1.


ORF Name 


NT 

NT ID AAID .. • — , 
Length 


AA 
Length 


Score 


Probability 


44t4512_ll_32 


975 2895 111 | 


336. 




113 


|9,3e-07 



Protein name 



Description 



Locus Name 



sp:Y173_HAEIN



Acc# 
P43960 



HYPOTHETICAL PROTEIN HI0173










ORF Name 


NTID AAID 


'•, ' NT 
Length 


: AA' 
Length 


Score 


.Probability 


4$48188_±i_38 


. |976 2896 - 


421 | 


1266 . 


|1342 | 


lb .4e-137 



Protein name 



Description 



Locus Name 



sp:bADA_JKmr 



Acc# 
P29011 



D-AMINO ACID DEHYDROGENASE SMALL SUBUNIT


■.. '.. NT 
ORF Name • . NTID • AAID — , 
'* .■■> ■■ Length -v 


AA 

-w- - — . , . Score 
Length 


Probability 


5 11 9 12 7_c 2^15 3 977 2897 6bl 


1956 |1464 | 


|6.4e-150 ■ . 


Protein name ' ;„ 


. Locus Name' 


acc# 1 ■ ; 



Description 1 



sp:YHES_ECOLI



HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YHES




ORF Name .." 


NTID AAID 


NT AA ■' 
■ — _ — - , - Score ■/ 
Length Length 


Probability ' 


6727086"_rl_17 


975 2898 


563 1692 | |U79 | 


|6.3e-88 



Protein name 



Locus Name 



Acc# 



putative gamma-glutamylcysteine synthetase

gp:PSP243941

AJ243941



Description . , 

Pseudomonas sp. strain HR199 partial vanB, Idh, gcs, ehyA and ehyB genes.






ORF Name 



.Protein name 



Description 



DNAJ PROTEIN



NT ID 



AAID 



NT AA 
Length Length 
407 I : 11224 



Score Probability 



[1146 | p.2e-ll> 



Locus Name 



sp:DNAJ_SALTY 



Acc# 
Q60004 



ORF Name 



685467-7 c^ 167, 



" Protein : name 
Description ■ 



NO-HIT



NTID 



AAID 



NT AA 
Length Length 
1462 | 



— •• — Score Probability 



J 



•Locus Name 



Acc# 



ORF Name 



801452 £2 60 



Protein name 



Description 



NO-HIT 



NTID 



AAID' ':■ 



NT ; - AA 
Length Length 
76 



Score 



Locus ' Name 



Probability 



Acc# 



ORF Name 



1884712 cl 115 



Protein name 



NTID 



AAID 



NT AA 
Length ,' Length 
1191- | TTZTZT. 



Score 



Locus - Name 



hypothetical protein PH1246 



pir:A71069 



Probability 

p.oo^i — " 

Acc# 
A71069 



Description 



ORF Name 



915633* cl Hb 



Protein name 

Description 

--»- 

NO-HIT



NTID 



AAID 



NT AA • 
— Score . 
Length Length : — 



7 4 | [225 | 

Locus Name 



Probability 



Acc# 






ORF Name 



NT-ID AAID 



9766888. £1 18. 



NT. AA 
Length Length 
182 



Score 



Probability 
4 . 8e-19 



Protein name 
Description 

HYPOTHETICAL 17.8 KD PROTEIN IN ALGR2 5' REGION



Locus Name 



sp:YA21_PSEAE 



Acc# 
P21482 



ORF Name 


NT ID' AAID 


NT . 
Length 


AA 
Length 


Score 


Probability 


i0329680rti_3\ 


985- ■ 2 905 


544 


163 5 ■ 


. I 587 I 


5 . 5e 


-S7 


Protein name 






Locus Name 




Acc# 








gp : P5H0PRP 




D28119 


Description 


.* 












Pseudomonas aeruginosa oprC gene for outer membrane protein C, complete cds.


ORF Name . ' ■ 


• ,NTID AAID 


NT 

Length 


AA . , 
Length 


Score 


Probability 


10625550_c3_128 , - 


; 986 , 2906 


65 


198 , 








Protein name j; ' 






Locus Name 




Acc# 


Description 














NO-HIT


ORF Name 


NT ID' AAID 


V NT 
' Length 


AA ■ 
Length 


Score. 


. Probability 


11017010_c3_i39 . 


> 987 2907 


247 


744 \ 








Protein name 






Locus Name 






Description 














NO-HIT

. "f 














ORF Name * 


-.. NT ID AAID 


NT 
Length 


«■ AA " 
Length 


Score 


Probability • 


12854752_rl_13 ; 


988 2908 


69 


|210 


.184 -1. 


.10. 016.'.; ""■ T 


Protein name 






Locus, Name 




ACC# 


conserved, hypothetical protein 




pir:A72221




A72221



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


' AA ■ 

T — . , Score 
Length- • 


Probability 


13759627 CS IJo 


yyy 




7J-B — ■ 






Protein name , 








Locus Name ■ 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 

■ , Score 
Length 


. Probability 


14633433_t3_5S 


|990 


2510 


122 


■ 359 ; 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


'■' NTID 


AAID 


NT 
Length 


AA ' 
. — ; , Score 
Length 


Probability . 


15023402^c3_125 . 


|991 


12911' 


887 


|2564 J1410 | 


|6.3e-l<=>0 


Protein name 








Locus Name 


Acc# 



Description 



sp:FTSK_COXBU



P39920 



CELL DIVISION PROTEIN FTSK HOMOLOG












ORF Name . 


• NTID 


AAID . ■ 


J NT ■ 
Length 


AA 
Length , 


Score 


Probability- 


19537930_ll_i 


|992 


2912 


97 


294 .; 


115, ; 


• . 4.5e-07 



Protein name • 



Locus Name 



hypothetical protein APE0900 



pir:D72685



Acc# , 
D72685 



Description 



ORF Name 



NTID AAID 



19812500 11 14 



TOT 



NT AA 

- — • — Score 

Length Length - 

T7¥ 



2~9"2ir 



Probability 
5.8e-220 ; 



Protein name 

Description 
DNA POLYMERASE I (POL I)



Locus Name 



sp:DPO1_HAEIN



, Acc# 
P43741






• ORF Name 


NT ID AAID 

i. ,-■>■- - . - 


NT 
Length 


AA 

T — . , Score 
Length 


Probability 


2008433_el_I02; 


994 | |2914 


| 287 | 


U64 




Protein name 






Locus Name 


Acc# 


Description • 










NO-HIT


ORF Name 


NTID AAID 


NT 
Length 


AA 

„ - — , Score 
Length 


Probability 


20506502_c2_10$ 


995 2915 


62 


189 




Protein' name ' 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name , 


NTID AAID 


NT 
. Length 


AA 

. — . ,,■ ■ Score 
Length 


Probability 




| |996 2916. 


|B88- 


2667 ' |2 717 | 


|l.le-282 


Protein name 






Locus Name 


Acc# 


DNA topoisomerase






pir:C64119


G64119 


Description . - - . 










ORF Name' 


- NTID ' ' AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability , 


2 13 15 5 0_t 1^9 , .. 


997 • 2917 


ii v * 


.348 153 


' 5,,4e-Il ' 


Protein name > 






Locus Name 


Acc# 


pterin- 4 - alpna -carJDinplamine 
dehydratase : protein s'sl2296 : protein 


SS12296 


pir:S74881


S74881 






Description - 

. . i ■■ ■ ■ ■ , 










' ORF Name 


■NTID AAID 


NT 
Length 


AA 

„ — Score 
Length 


Probability 


22119.402 ■ 1 t^_77 


|998 2918 


262 


|789 | |491 


8.2e-47 



Protein name . '*■ ■ 

Description 

OCTOPINE TRANSPORT SYSTEM PERMEASE PROTEIN OCCM



Locus Name 



'sp:0CCM AGRTi 



Acc# 
P35115 



ORF Name 



NTID AAID 



234404 c'A ill 



NT AA . 

Length . Length 
TT5 



Score Probability -. 
275 



6.3e-24 



Protein name 

Description . 
METHYLTRANSFERASE)



Locus Name 



sp:RRMA_ECOLI



Acc# 
P36999



ORF Name 


NTID 


AAID 


• NT 
Length 


AA ■ 
; Score 
Length 


Probability 


23928130_i3_66 


1000 


2920 


* 3 1 


|192 




Protein -name ! 








Locus Name 


Acc# 


Description " 

■ - . ■ • , 












NO-HIT


ORF Name 


NTID . 


AAID ' . 


NT 

Length 


AA • „ ■ ■, 
t "~ Score 
Length 


Probability ■ , 


24256553^t2_3I- . 


1061 


2921. , . 


751 | 


2256 |2080.| 


p.4e-215 



Protein name 



Locus Name 



DNA topoisomerase IV



gp:AB023570



Acc# 
AB023570 



Description ... 1 

Vibrio parahaemolyticus parE gene for DNA topoisomerase IV, complete cds.



ORF Name 



NTID . AAID 



- NT AA . i 

■t^u - t ' 1 Score Probability 

Lengtn Lengtn ■ ■ - ■ 



24545763 cl 91. 



2922 



12 96 ,; 



] [ 



|8.4e-22 



Protein name' 



Locus Name 



hypothetical protein jhp1155



pir:G71841



, Acc# 
G71841



Description 
ORF Name 



NTID 



AAID 



24881313. .12 32 



[TMT 



NT " AA 

Length Length 
7u — 



Score Probability 



Protein name 
Description 

NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID 



AAID 



NT ' . AA , 
■ : — '■ — , Score . Probability 
Length Length , - ■ . — ; dL 



2541/ybO c3 134 



TUT 



1.8e»17- 



Protein name 

Description 
METHYLTRANSFERASE)



Locus Name 



Acc# 
P36999 



ORF Name 



NTID 



AAID 



NT AA . 

— ■ ■ — ; Score 
Length Length — — 



31405203 c2 115 



TTT" 



TFT 



Protein name 
Description . 
NO-HIT



Locus Name 



Probability 



Acc#



ORF Name 



• NT ' AA 

NTID AAID — ; — , Score Probability 
. Length Length — — — — ■. ■ - — 



31485525 13 72 



W23T 



5.0e-40 ' 



Protein name _ 



Description 



Locus Name 



gp:AB032934



Acc# 
AB032934



Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical proteins, complete cds.



ORF Name 



34025462 cl 104 



NTID 

rutrr 



AAID 



NT AA 
Length - Length 
69 



Score Probability 



TUT 



Protein name 
Description . 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID 



AAID 



34085012 ci 57 



TuW 



NT ' AA 
Length Length 
138 



Score . Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID AAID 



3536.0075 FT 2 



TuW 



NT AA 
Length Length 
— : 



Score Probability ' 
|73 J JO.0'32 — : — 



Protein name 



Locus Name 



nef protein



|gp:AF169778 



Acc# 
AF169778 



Description 



HIV-1 isolate G221 from India nef protein (nef) gene, partial cds; and 3' long terminal repeat, partial sequence.



ORF Name NTID 


AAID 


NT 1 
Length 


. — , Score 
Length 


Probability 


3S110253_12_3S 1010 


|2930 


108 


327 , |75 \ 


| [0,0077 


Protein name 






Locus Name 


.'. Acc# . 


outer surface protein A






gp:BBPWUDII


■ £68539 


Description// 


B. burgdorferi (PWudII) plasmid OspA gene for outer surface protein A.


ORF Name ! ' NTID. 


AAID 


. NT 
Length 


AA 

, - , Score 
Length 


Probability 


p945893_13_71 - -.. 1011 


| p 93 ., 1 " 


.285 


861 1825 ' 




Protein name 






Locus Name 


Acc# -• 








gp:AB032934 


AB032934


Description 










Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical proteins, complete cds.


ORF Name NTID 


AAID . 


, , NT 
Length 


AA • 

— , Score 
Length 


Probability 


4337S63_t2_5i; 1012 - 


2532 


258 • 


807 434 


9.0e-41' 



Protein name 



Description 



Locus Name 



gp:AB032934



Acc# • 
AB032934 



Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical proteins, complete cds.



' ORF Name 

( i - - - ... . 

| 4691525_t T~T 



NTID AAID 



TTTTT" 



2933 



NT . AA 
Length Length 
78 



Score Probability 



Protein name 

Description 

NO-HIT 



Locus Name 



Acc# 






ORF Name 



NTID 



AAID 



147-35333' c2 110 



TtTTT 



NT 
n 

F7u" 



AA 

t Score 
Length Length — 

11749 



T7TT" 



Probability 
4.0e-180



Protein name 



Description 



Locus Name 



sp:RF3_HAEIN



Acc# 
P43928 



PEPTIDE CHAIN RELEASE FACTOR 3 (RF-3)








ORF Name 


NTID AAID 


NT 
Length, 


AA 
Length 


Score 


Probability 


|5102193_t^_b4 


| 1015 - |2935 


372 


1119 


|897 | 


|7.8e-90 



Protein name 



Locus Name 



sp:GLMU_HAEIN 



Acc# 
P43889 



Description '•' 
ACETYLGLUCOSAMINE-1-PHOSPHATE URIDYLTRANSFERASE)



ORF Name 


NTID 


AAID 


•. NT 
Length 


AA 
Length 


Score . 


Probability 












5111013_c3 126 ;• 


1016 


2936 . 


201 


506 






Protein name 








Locus 


Name 


Acc# 


Description. 














NO-HIT 


ORF Name •. 


',- NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability, 


65165i8_rl_25 . 


■ 1017 


2937 


243 


732 


499 


1.2e-47 



Protein name 



Description 



Locus Name 



sp:NOCQ_AGRT5



Acc# 
P35118



NOPALINE TRANSPORT SYSTEM PERMEASE PROTEIN NOCQ


NT . , AA '■ 

ORF Name ■ . ■ ■ NTID . AAID — , «— , Score 

Length Length 


Probability 


6820875_cl__100 1018 


2938 345 1038 159 


8.8e-09 , 



Protein name 



Locus Name 



apolipoprotein A-IV precursor



pir:C40892



ACC# 
C40892 



Description 






ORF Name 


• NTID 


AAID 


NT 

Length 


AA 

T — ^, Score 
Length 


Probability. 


807692_ci^liV 


1019 


2939 


563 


1592 > |171 | 


|1.2e-08 


Protein. name 










Locus Name 


Acc# 


Trip230 . 


gp:AF007217


AF007217 


Description - ■ . ' 




Homo sapiens Trip230 mRNA, complete cds.








ORF Name 


NTID 


AAID 


NT 
Length 


AA 

. — . , Score 
■ Length ... 


Probability 


978b02_t2_52 


1020 


| 2940 


266 


|801 | |430 | 


|2.4e-40 



Protein name 



Description 



Locus Name 



gp:AB032934



Acc# 
AB032934 



Vibrio alginolyticus ptsA, orfC, orfD genes for PF60 and hypothetical proteins, complete cds.



ORF Name 



110056500 11 3 



NTID AAID 
1021 



NT AA 
Length Length 
78 



[2TT 



Score Probability 



|4.7e-12 



Protein name 



Locus Name 



hypothetical protein HI0187 



pir:B64145



Acc# 
B64145 ' 



Description 



ORF Name 



NTID 



AAID 



. Score Probability 



110656456 c3 200 



NT AA 
Length Length 

1392 | pi | |7.7e-a3 



Protein name 



Locus Name 



sp:YWBN_BACSU



r ACC# 

P39597 



Description . ( '. 
HYPOTHETICAL 45.7 KD PROTEIN IN EPR-GALK INTERGENIC REGION PRECURSOR



ORF Name 



NTID 



AAID 



12603450 tl 20 



TTF2T" 



2943 



NT . AA 

Length Length 
558 " I- 11677 



Score 



Probability 
1.6e-128 — 



Protein name 



Locus Name 



sp:PILB_PSEAE 



Acc# 
P22608 



Description 
FIMBRIAL ASSEMBLY PROTEIN PILB






ORF Name" 



NTID 



AAID 



12978955 t'2 43 



NT 
Length 
1274 



AA 
Length 
1825 



Score Probability 
|509 | |2.5e-59 



Protein name 



Description 



Locus Name 



sp:YH2b_AZ0OT 



1 Acc# - 
P54085 



(ORF5)


ORF Name '" NTID AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


13679786_13_1I2 \ (1025 294S 


-,iiO, | 


333 


,|10b | 


|1.2e 




Protein name 




. .Locus Name 




Acc# 


hypothetical protein 


gp:BSZ75208 . 




. Z75208 


Description" 












B.subtilis genomic sequence 89009bp.


ORF, Name , NTID AAID 


NT 
Length ' 


AA, 
Length 


Score 


Probability 


13712643_c3_187 .1025 2946 


. 215 


648 , 


Pi 


|0. 00040 


Protein name 




Locus Name 




, Acc# ■ 


conserved hypothetical protein " 




pir:B75483 




B75483* 


Description " ' . • 












ORF Name " ,"" NTID AAID 


NT 
Length 


AA 
Length 


Score 


. Probability 


I4i9I915_t-2_69 .1027 .2947 


3S5 


1098 • 


329 


2 . 5e 


-34 . .; 


Protein name - 




Locus ' Name 




. Acc# 


conserved hypothetical protein ylbK


pir:H69874




. H69874 


Description .... 












ORF Name" .. , NTID . AAID 


- NT - 
Length 


AA 
Length 


Score 


.Probability 


14251568_t2_72 1028' 2948 


118 


357 • 


1207 1 


|1.0 ( e 


-16 


'Protein name ' 




Locus Name 




Acc# 


hypothetical protein APE1486


pir:F72628




F72628



Description 






ORF Name •: 


NTID 


, AAID 


"NTT* 
IN 1 

Length 


AA 

, ■ — . ; Score • 
Length 


Probability 


144921«0_c2_14y 


|1029 


2 94 9 ' 


6 7 ' 


2 04 ' 




Protein name 








' Locus Name 


Acc# 


Description 












NO-HIT


ORF Name • ■ 


NTID 


. AAID 


- NT 
Length 


AA 

, — . , Score 
Length 


Probability 


145i±56i_c3_197 ■ 


1030 


. 2950.: 


. 1319 " 


3550 J3473 | 


|0.0' : " \ 



Protein name 



Locus Name 



phosphoribosylformylglycinamidine
synthase : formylglycinamide ribonucleotide
synthetase : phosphoribosylformylglycinamidine



pir:SYECPG



Acc# • 

D65033:A31862:A34192



Description 
ORF Name 



NTID 



AAID 



14578927 12 42 



TuTT" 



2951 



NT 
Length 
T7T~ 



AA 
Length 
' 1816 



Score ' ' Probability 
1294 J |4.8e-39 ~ 



Protein name 



Description 



ORF' Name 



Locus Name 



sp:HIS2_AQUAE



Acc# 
O67780



NTID , 



AAID 



14882750 c2 164' 



TUTT 



2952 ■ 



■ NT 
Length 
1409 



AA 
Length 
11230 



Score Probability 



9.0e-5,7 



Protein name 



Locus Name 



putative membrane- transport protein. 



gp:SCC75A



Acc# 
AL133220 



Description \ ... 
Streptomyces coelicolor cosmid C75A.



ORF Name 



NTID AAID 



14970637 c3 176 



TUJT 



NT V ■* ; AA • ■ 
Length Length 
237 I 1714 



•Score 



\7WT 



Probability 
|1.3e-78 



Protein name 

Description • 
PROTEIN F21.5)



Locus Name . 



sp:CLPP_ECOLI



Acc# 
P19245 






ORF Name • , 


NTID AAID 


IN 1 

, •'. Length 


AA : o 
T — , , Score 
Length 


Probability 


i583i636_c2_158 , 


1034 2954 


127 


384 . 236 


1. 5e 




Protein name 








Locus Name 




Acc# • 


Acriflavin resistance protein D




gp:D90846






Description 




' -.-■*■ * 








D90846:AB001340


E.coli genomic DNA, Kohara clone #357 (46.5-46.8 min.).








ORF Name • 


NTID AAID 


. NT 
Length 


AA 

^— " Score 
Length 


Probability. 


165842 _c2_157 


1035 2955 


205 


618 332 


5,8e 


-30 


Protein name. " 








Locus Name 




Acc# 










sp:NOLH_RHIME




P25198 


Description 














NODULATION PROTEIN NOLH PRECURSOR




ORF Name 


NTID . AAID 


NT 

Length .. 


• AA • v.. \Y . 
, — , . Score 
Length 


Probability 


i6687526_±l_10 • 


1036 ' 2955 


448 ■ | 


1347/ |1082 | 


|1 . 9e 


-109 | 


Protein name • ' ' 








. Locus Name 




Acc#' 










sp:ARGA_ECOLI






Description 












P082 05 :068 
009:068010 
: 06 8 011: 06 ; 


SYNTHASE) (AGS)




ORF Name 


. NTID AAID 


NT 
Length 


AA 

— , S.core 
Length 


Probability . 


i9i675-_c.2_J50 


1037 2957 


251 


756 |698 | 


|9.5e 


-69 


Protein name 








Locus Name 




Acc# 


5' adenylylsulfate APS reductase






gp:AF170343


AF170343


Description 












Burkholderia cepacia 5' adenylylsulfate APS reductase (cysH) gene, complete cds; and ATP sulfurylase small subunit (cysD) gene, partial cds.








ORF Name 


NTID . AAID 


NT ' 
Length 1 


AA. ' 
, — . ; Score 
Length 


Probability 


19821942_c3_205 


1038 . 2958 


64 


195 






Protein name 




. i 




Locus Name 




Acc# 



Description 
NO-HIT






: ORF Name 


NTID 


AAID 


NT 
Length 


l f. . 

AA 

, — ■ , Score 
Length 


Probability 


20203302_c3_185 


1039 


2959 


225 


681 . 1141 1 


4.0e 


-08 


Protein name 










Locus Name 




Acc# 


DnrE protein 


gp:PST131716


AJ131716' 


Description 






... . 








Pseudomonas stutzeri dnrE gene and ORF235 (partial).








ORF Name 


NTID 


AAID ■ 


NT 
Length 


AA 

, — , Score 
Length . 


Probability 


20E>70385_c2_159~ 


1040 


2960, 


193. 


, 582 1 






Protein name ; 










Locus Name 




Acc# - 


Description 
















NO-HIT




ORF Name 


NTID 


AAID • 


NT 
Length 


AA 

— . Score 


Probability 








Length 






21494010_c2_174 


1041 


2951 


455 


1368 |1.377 | 


11. le 


-140 ; 


Protein name 










Locus Name 




Acc# 


nitric oxide reductase 


gp:AF002217 - 


AF002217 


Description 






Ralstonia eutropha megaplasmid pHG1 nitric oxide reductase (norB) gene, complete cds.




ORF Name . 


NTID 


AAID 


NT 
Length 


AA ' 

' , — . , . Score 
Length 


Probability 


21676712__c.i'il20 ' 


1042 


2962 


130 - 


3?3 . 395; 


1.2e 


-36 


Protein' name 




t 






Locus Name 




Acc# • 


sulfate adenylyltransferase subunit CysN




gp:AF130466 




AF130466 



Description 



Campylobacter jejuni peptide chain release factor 2 (prfB) gene, partial cds; alpha-2,3-sialyltransferase (cst-I) and sulfate adenylyltransferase subunit CysD (cysD) genes, complete cds; and sulfate adenylyltransferase subunit CysN (cysN) gene, partial cds.






ORF Name 



NT ID AAID 



22078812 c'A 154 



TuTT" 



NT AA 
Length Length 
727 | 12184 



Score Probability 
11636 I |3.8e-168 " 



Protein name 



Description 



Locus Name 



sp:RECG_ECOLI 



Acc# 

P24230:P76721



ATP-DEPENDENT DNA HELICASE RECG


ORF Name 


NTID AAID 


NT . 
Length 


AA 

. — , Score 
Length 


Probability 


2235092S_ci_li5 


1044 2964 


724 - 


2175 2193 


3.6e 


-227 


Protein name 








Locus Name 




Acc# 










sp:FAOB_PSEFR


P28793 


Description 














! 

ORF Name 


NTID AAID 


NT 

Length 


AA Score 
Length 


Probability . 


23556252_c3_203 -■ 


1045 . 2965 


,600 | 


1803 |709 | 


6.0e 


-85 


Protein name 








Locus Name 




■•Acc#\ 


glutamate synthase (ferredoxin) homolog yerD


. pir:C69794 




C69794 


Description 














ORF. Name 


NTID AAID 


NT 
Length 


AA 

T - — .,, Score 
Length - ■ 


Probability 


23593830_c3_199 


1046 : 2966 


340 • 


11023 . 462 


9 . 7e- 


-44 


Protein name 








Locus Name 




Acc#. 



Description 



sp:YWBM_BACSU



P39596 



HYPOTHETICAL 42.8 KD PROTEIN IN EPR-GALK INTERGENIC REGION




ORF Name 


- .NT 
NTID AAID ' • — ■ , 
Length 


AA --, 
r ~~ ", i Score 
Length 


Probability ' 


23651900_t2_48 \ 


1047 | 2967 328 | 


|987 | |634 | 


|b.8e-62 


Protein name , 




Locus Name 


Acc# 



Description • 
HYPOTHETICAL PROTEIN HI0270 



sp:YOHI_HAEIN



P44606 






ORF Name 



NTID 



AAID 



23862b76 ci 124 



11049 



12968 



NT 
n 



AA 

t — Score 
Length :> Length ^— 



T8T~ 



Probability 
4.5e-I4" 



Protein name 



Locus Name 



probable antibiotic resistance protein mtrC 



Description 



pir:S42418



Ac'c# 

S42418:S40252



ORF Name 


NTID 


AAID 


. NT 
Length, 


AA 

, — . , Score 
Length 




Probability 


24042500_c3_I83 


| |1049 


| 2959 


P 13 1 




942 J608 




|3.3e-b9 


Protein name 










Locus Name 




Acc# 














u 


Q10600 


Description 
















SULFURYLASE)


ORF Name 


NTID 


AAID 


• \ . NT 
Length 


AA 

j "■■ — . , Score 
Length 




Probability 


24302260_c3\;i79 \ 


1050; 


2970 


410 




1233- 1331 




8.6e-136 


Protein name 










Locus Name 




Acc# 


3-oxoacyl-CoA thiolase








gp:AP150572 




AF150672 


Description '. 1 ■ ■■ 


Pseudomonas putida 3-oxoacyl-CoA thiolase (fadA) gene, complete cds.


ORF Name 


NTID 


AAID 


• NT 
Length 


AA 

— Score, 
Length 




Probability 


25584438_c3_189 v 1 - 


1051 


2971 


143 




432 |174 




|7.6e-12 



Protein name 



Locus .Name 



CeoB 



gp:BCU97042



Acc# 
U97042 



Description . 

Burkholderia cepacia CeoA (ceoA) and CeoB (ceoB) genes, complete cds.



ORF Name 



NTID 



AAID 



25984558 C3 188 



TTTT 



NT 
n 
T3T 



AA 

— Score 
Length Length — . 



PIT 



T7T 



Probability . 
1.8e-0 7 . " 



Protein name 



Locus Name 



acriflavin resistance protein D (acrD) RP170



pir:F71727



Acc# 
F71727



Description 






ORF Name 



NTID AAID 



26741bb<!> cl 13b 



11053 



T97T 



NT 
Length 
T9"S 



AA 

T ~^\-x. Score 
Length - — = 

[TOT - 



Probability 
10.0011 



Protein name 



Description 



Locus Name 
sp:HA34_MELC 



Acc# 
Q99074 



HAM34 PROTEIN 


ORF Name 


NTID AAID ■ • . 


NT 
Length 


. AA 
^~ . , Score 
Length 


Probability 


[324b640_c3_190 \ 


[1054 | 2974 


425 


|127y 265 | 


|9.4e 


-20 ' I 


Protein name 








Locus Name 




ACC# 


probable cation efflux system protein


pir:E71874 ' 




E71874 


Description ' ' 














ORF Name 


NTID AAID ■ 


NT 
Length 


AA ' • . 
•■ — . , Score 
Length 


Probability 


3375052_ci_125 ' 


I0S5 297b ; * 


14 4 


435; |177 | 


|3.6e 


-12 


Protein name 








Locus Name 




Acc# 


probable efflux transporter






pir:H71918 . - ,-- , 


. H71918- 


Description 














ORF Name 


NTID AAID ' 


J . NT ■ 
Length 


AA 

, — , Score 
Length 


Probability • 


|34027092_c2_175 


1.0 S 6 - 2975 . | 


I 74 


222 60 


• 0.025 


Protein name 








Locus Name 




Acc#/ , , 


■tonoplast intrinsic protein- 






gp:AF037061




AF037061


Description 














Zea mays tonoplast intrinsic protein (ZmTIP1) mRNA, complete cds.




ORF Name 


NTID ». AAID 


NT 

Length 


— Scor 
Length , 


Probability. .• 


35197126_t3 79 


|1057 . 2977 


179 


|540 . 155 


3 .3e 


-11 


Protein name 








Locus Name 




Acc# 


TatB protein 








gp:ECO5830- 


AJ005830 



Description 
Escherichia coli tatABCD operon.






OR F , Name ■ NT ID AAID • ' ^7-, . — ^ Score •: Probabil 

— . .- , rr- — — ■ T.onrrr h T.d-n/rt-Vi : - : ; — 



|35339i3b;_r2_61 t 


| |10b8 


2978 


1 2 "- 1 


795 | |308 | 


|2.9e-3b 


Protein name • 










Locus Name 


• Acc#' 












sp:LEP3_AERHY


P45794 


Description 














TYPE 4 PREPILIN-LIKE PROTEIN SPECIFIC LEADER PEPTIDASE


ORF Name 


NT ID 


AAID . 


■ NT 
- Length 


AA 

_ — Score 
Length 


Probability 


3i539.5926_tl_6 . 


10b9 


2979. 


I 511 


1536 325 


2.2e-3b ■ 


Protein name 










Locus Name 


'■■ Acc# >. . 


probable helicase ■ 


pir :T40239 


T40239 : 


Description 














ORF Name 


NT ID 


AAID : 


: NT 
Length : 


AA " ' 
T — ; , Score ; 
Length 


Probability 


3504730.8- 1 2^5 6 


1060 


2980 


|105 


318 • . . 




Protein 1 name 










Locus Name 


■v Acc#: 


Description 














NO-HIT


■ ORF Name 


NT I D 


AAID " 


NT 
Length 


AA 

. - — : ' Score . 
Length = 


Probability ; 


3939838__c2_173- 


1061 


2981 


66 


.201 |189 | 


|8.2e-15 


Protein name 










Locus Name . 


' Acc# . 


k 










sp:RL35_PSESY 


P52830


Description 














50S RIBOSOMAL PROTEIN L35












ORF Name 


NT ID 


AAID* - . 


NT 
Length 


AA 

t — , i Score 
Length 


Probability 


3947675_l2_46 


1062 


29U2 


250 . 


1753 | 375 


1.6.e-J4 r 


Protein name ' 










Locus Name 


Acc#. . 












sp:HUTC_KLEAE


P12380



Description ' 



HISTIDINE UTILIZATION REPRESSOR






ORF Name 
13954218 c3 177 



NT ID AAID 



TUUT 



NT • . 'AA - 

. Length Length 
441 I ' 11325 



Score 
11252 | 



Probability 
7.6e-140 



Protein name 

Description 
ATP-DEPENDENT CLP



Locus Name 



sp:CLPX_HAEIN



Acc# 
P44838 



PROTEASE ATP-BINDING SUBUNIT CLPX



ORF Name 


NT ID 


AAID 


NT 
Length 


AA. '' 
- — , Score . . 
Length , 


Probability 


4328428 _cl 119 


| 1064 




2984 


310 


933 911 


2.6e-91 


Protein name . , 










Locus Name 


Acc# 


























sp:CYSD_MYCTU


Q10599, 1 


Description 














SULFURYLASE)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


4485953_c3._206 


|.1065 




2985 


121 


355- 469 


l:8e-44 y 


Protein name 










Locus Name, 


Acc# 


ribosomal protein L20








pir:R5EC20




Description 












D64930:S08608:A02806:I41282


ORF Name 


. NTID. 


AAID 


' NT 
Length 


AA 1 "-' ' 
, — . , Score ■ 
Length 


Probability 


5120443_t3_94 


| 1055. 




2985 


|214 . 


545 , 353 


3.0e-33 



Protein name 



Description 



Locus Name 



sp:YAGE_VIBVU 



, Acc# 
Q56741 



HYPOTHETICAL 22.5 KD PROTEIN IN WPD 3' REGION (ORFX)






ORF Name 


NTID ' AAID 


NT. AA 
Length Length 


Scorei 


Probability.. 


5890543_t2_73 - 


1057 J2987 


380 1143 


• 131 


7.7e-07 



Protein name ■■■ 



Locus Name 



dnaJ protein homolog 



pir : S34632 



Acc# 
S34632 



Description 






ORF Name 



NTID 



AAID 



7277 FT 93 



, NT , AA 

Length Length , 
1423 I •11272 



~~~ Score 



Probability 
3.8e-65 " 



Protein name 



Locus Name 



pilus assembly protein PilC



gp:AF038655



Description 



Acc# . 
AF038655



Legionella pneumophila pilus assembly protein PilB (pilB), pilus assembly protein PilC (pilC), and type IV prepilin-like protein specific leader peptidase PilD (pilD) genes, complete cds.



ORF Name 



NTID • AAID 



AA 

T — ^ — Score 
Length Length 



978bl7 c2 166 



2989 



NT 
n 
TO 



1302 



Probability 
5:.8e-112 



Protein name 



Locus Name 



sp:GLTS_HAEIN



ACC# 
P45240 



Description 

SODIUM/GLUTAMATE SYMPORT CARRIER PROTEIN (GLUTAMATE PERMEASE)



, ORF Name 



NTID 



AAID 



10547711 c2 8 



TuTO" 



NT AA 
Length . Length 
T75 — ' 



Score 



Probability 
ll.2e.93 " 



Protein' name 



Description 



Locus Name 



sp:ABC_HAEIN



Acc# 
P44785 



ATP-BINDING PROTEIN ABC










ORF Name 


NTID, 


' AAID - 


NT 
Length 


AA o ... 
\ — \ , Score 
Length 


Probability 


1S8807_c2_1'0 


1071.; 


2991 . 


118 ; 


354 ■ 320. - 


I.ie-28, 


Protein name 








Locus Name ^ ■■ 


. Acc# 



Description - 



sp:PLPA_PASHA



Q08868:Q07363



OUTER MEMBRANE LIPOPROTEIN 1 PRECURSOR (PLP1)








ORF Name 


NTID AAID 


NT 
; Length 


AA 
Length 


Score 


Probability 


19589S35_ci_7 


1072 |2 9.92 


|99 


J 300 1 


I 190 1 


|6.5e-15 



Protein name 



Locus Name 



ORF120 



gp : tldORRNHKl-2 



Description 

E.coli genomic DNA, 5' flanking region of rrnH gene.



Acc# 
D15061 






ORF Name 



NT-ID AAID 



24353193 tl I 



TuTT" 



2993 



. NT AA 
Length Length 
95 



Score 



TTT 



Probability 
|9.3e-07 



Protein name 



Locus Name 



hypothetical protein PH0133



pir:C71234 ■ 



.. Acc# 
C71234 



Description 



ORF .Name 



NTID AAID 



35582056 c3 11 



1074 



2994 



NT ; AA 
Length Length 
240 



Score 



J7T 



Probability 
1.2e-Sl 



Protein name 



Locus Name 



sp:VAEE__HAEIN 



Acc# ,, 
P46492 



Description '*.",. ; 

HYPOTHETICAL ABC TRANSPORTER PERMEASE PROTEIN HI0620.1



ORF Name 
4144442 Tl 5 



NTID 
1075 



AAID 



NT ■ AA 

Length Length 
131 



Score Probability 



Protein name 



Description 



Locus Name 



-. Acc# 



NO-HIT 


ORF Name 


NTID 


,r 

AAID 


■■■■ *d V 
Length 


AA 
Length 


Score 


' Probability 


ibftoAllt ill 


| 1076 


. 2595 


357 




1104 


; 137 


7.2e-06 



Protein name 



Locus Name 



■i 



gp:PSENOSA



ACC# 
M60717



Description 

P.stutzeri NosA protein (nosA) gene, complete cds.



ORF Name 



NTID 



AAID 



bl09792 Tl 3 



TUTT 



NT . AA ; 

Length - Length 
122 I . "136 9 



Score Probability 



.Protein name 



Locus Name 



Acc# 



Description 
NO-HIT



ORF Name 


NT ID 


AAID 


NT 
;■' Length 


AA 

; , Score 
Length 


Probability 


16681688_cl_3 


1078 


2998 


SO 


183 , |55 | 


10.044 


Protein name 










Locus Name 




Acc# 












sp:RNH_HELPY


.P56120 


Description. 
















RIBONUCLEASE H (RNASE H)
















ORF Name 


NTID 


AAID 


NT 
Length- 


■ M"- ' ■ . 

— . , Score 
Length 


Probability 


24039143 til 


| 107? 


2999 


, ,i4> | 


|1038 | 637 | 


2.8e 


-62 


Protein name' 










Locus Name 




Acc# 


ornithine decarboxylase 




pir:D72200 ■ 


D72200 


Description. 
















ORF -Name 


NTID ' 


AAID 


• ; NT 
Length. 


AA 

j — r.., . Score " 
Length - 


, Probability 


29337840_t2_2 


1080. 


. 3000 


150 . . 


450 |95 | 


|0 . 014 . | 


Protein name; 






I - 




Locus Name 




Acc# ' 


AvtA , - 










gp:AF014804


AF014804 



Description 



Neisseria meningitidis PglB (pglB), PglC (pglC), PglD (pglD), and AvtA (avtA) genes, complete cds.



QRF. Name 



NTID AAID 



111400461 T2 7 



11081 



\TUUT- 



NT ■ AA 

Length Length 
69 



Score Probability 



Protein name 
Description • 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID 



AAID 



114573562 rl -2 



Protein name 7 

Description 
(EC 37T7T75TJ 



1 )1082 | [3002 | 



,NT • AA 
Length Length 
1341 | 11026 



. Score Probability 
1107b I |l.le-10.8 " 



Locus Name 



sp:TRMU_ECOLI



Acc# 

P25745:P75964






ORF Name 




NTID AAID 


NT 
Length 


AA " 

_ — , Score 
Length - 


Probability 


3027046bJ:l_3 




1083 | p003 


F 49 1 


750 |185 | 


|b.7e-14 | 


Protein name 








Locus Name . 


... Acc# 










sp:YYAD_BACSU


P37520 


Description . 












HYPOTHETICAL 37.7 KD PROTEIN IN RPSF-SPO0J INTERGENIC REGION




ORF Name 




NTID AAID 


NT 
Length 


AA 

. — . , Score 
Length 


Probability 


635593<v_i2_6 - 




|1084 3004 | 


108 


3-27 | |170 | 


|8.be-13 


Protein name 








Locus Name 


Acc# 










' |gp : ECPURB 


X59307 


Description, . 












E.coli ORF-15, ORF-23, purB and phoP (5' end) genes.




ORF Name 




• NTID AAID 


NT 
Length 


AA 

- — ■ , Score . 
Length 


Probability 


|S537957_cl_li 




|108b 3005 


200 


603 454 


6.8e-43 


Protein- name , 








, Locus Name 


Acc# 










sp:YGBB_ECOLI


P36663 . 


Description 












HYPOTHETICAL 16.9 KD PROTEIN IN SURE-CYSC INTERGENIC REGION (ORFO)


ORF Name 




NTID AAID 


NT 

Length 


AA 

— Score 
Length 


Probability 


|17,3187_f 3_5 . .. 




1085 3006 | 


m - ■ 


i 2oi i : 




Protein name 








Locus Name 


Acc.# 


Description 












NO-HIT


ORF Name 




NTID AAID 


NT 
Length 


AA 

_ — . , ' Score 
Length 


Probability 


34407827_t2_2 . 




1087 | 3007 | 


3W | 


1101 | |786 | 


|4.be-78 | 


Protein name 








Locus Name 


Acc# 



Description 
SYNTHETASE) 



sp:LCFA_ECOLI



P29212 






ORF Name 


NT ID 


" AAID 


"NTT 
V* 1 

Length 


T r^_, Score 
Length 


Probability 


7285152_tl_l 


. 1088 


3008 


" 69 i 


2 07 






Protein name 








Locus Name 




Acc# 


Description .V 














NO-HIT . . , 


ORF Name 


NTID 


AAID 


NT 
Length 


AA ■ . 
, — . , Score 
Length 


Probability 


i6056563_c2_jS3, 


1089 


3009 


. 141 


426 : 90 


' 0.020 


Protein name 








Locus Name 




Acc# 


hypothetical w.ttw protein 






pir:T41252




T41252 


Description . 










» 




ORF Name 


NTID 


AAID . 


NT 

Length 


AA 

, — \ , Score 
Length 


Probability . , 


t!)86087_t3j:30 \. 


1090. 


3010- 


412 | 


|1239 |1316 | 


p.le-134 


Protein name 








Locus Name 




Acc# 



Description ' 



sp:SERA_HAEIN



P43885 



D-3-PHOSPHOGLYCERATE DEHYDROGENASE (PGDH)








ORF Name"' NTID AAID /• ■> 


NT 
Length ' 


"' AA 
Length , 


Score '* 


Probability. 


2072762_ri_10- 1091 ■ 3011 


1220 


3,663 


. 524 . 


4.4e-93, . 



Protein name 



Locus Name 



chromosome segregation SMC
protein : minichromosome stabilizing protein
SMC



pir:G69708 . 



• Acc# ■ 

G69708:JC4819:PC4029



Description 
ORF Name 



NTID 



AAID 



21531252 13 36 



1092 



JUTT 



NT . 
Length 
TOT" — "I 



; AA 
Length 
|912 



Score Probability 



|722 | 



|2.7e-7i 



Protein name 



Locus Name 



translation elongation factor EF-Ts



pir:EFECS



Description , 



; Acc# 

A03525:A45 
269:A32881 
:S45235:B6 



ORF Name, 


NT ID 


AAID 


NT . 
Length 


AA 

.__ score 
Length — — — 


Probability 


23406_11__II 


| |1093 


3013 


| | 2 78 


|837 |898 | 


|6.1e^90 | 


Protein name 






• . • - 


Locus Name. 


Acc# 










sp:RS2_SPIPL


P34831 


Description 








v 




30S RIBOSOMAL PROTEIN










. ORF Name 


NT I D 


AAID 


NT 
Length. 


AA 

, — . , Score 
Length 


Probability 


234-37562_rl_7 


1094. 


3014 . 


253 | 


|762 | 186 


3.2e-18 



Protein name 



hypothetical protein HP0862 



Locus. Name 
|pir:F64627 



Acc# 
F64627 



Description 



ORF Name 



NTID .AAID 



NT AA ■ , 
— . ■ • -— " Score Probability 
Length Length — - ■ - ; ~- — tL - 



23613510 c2 70 



3"0T5~ 



T7T* 



1125 



rrmr 



|S.8e-69- 



Protein name 



Locus Name 



sp:YCFO_ECOLI



Acc# 
P75949 



Description • ? ■ ■ ■ 1 " , 1 . 

HYPOTHETICAL 37.6 KD PROTEIN IN FHUE-NDH INTERGENIC REGION



ORF Name 



NTID AAID 



23861685 13-35 



1096 . I nnrr^ 



• NT AA 1 

Length Length 
180 I IS43' .. 



Score 



Probabil'ity 
|1.6e-50 — 



Protein name 



Locus ' Name 



invasion protein homolog



gp:AF116285



Acc# 
AF116285 



Description 



Pseudomonas aeruginosa invasion protein homolog and phosphoenolpyruvate-protein phosphotransferase PtsP (ptsP) genes, complete cds.



ORF Name 



NTID 



AAID 



2460181 11 1 



TuTT 



NT 



AA 

— . , Score 
Length Length ~ — 



Probability 



rarer 



Protein name 

■ . :■ ' . i, • 

Description 

BETA CHAIN) (RNA POLYMERASE BETA SUBUNIT)



Locus Name; 



sp:RPOB_PSEPU



, Acc# 
P19175 






ORF Name 



NTID AAID 



■ AA 

- t ■ • r — Score 

Length Length . ■. .- 



11099 



NT 

eric 



TTT 



Probability 
3. 3 e- 34-. " 



Protein name 



Locus Name 



gp:PAU89892



Acc# , ■ 
U89892 



Description , • ■ ' * 1 . . 

Pseudomonas aeruginosa virulence factor regulator (vfr) gene, partial cds.



ORF Name 


NTID ' AAID , ! 


• NT 
Length 


AA 

r — . i Score 
Length 


Probability 


26064760_c3_93 


1099 | 3019 


"... | 


poi. , 




Protein name ' 






Locus Name 


Acc# 


Description 










NO-HIT


ORF Name 


NTID AAID • 1 


. NT 
Length 


AA • • 
— , Score 
Length 


Probability 1 - 


|298907pl_t3jji 


|il00 . 3020 


469- ■ . 


1410 |l£78 | 


|1.3e-172 . | 


Protein name 






Locus - Name 


Acc#. 








sp:GSHR_HAEIN


P43783 


Description 










GLUTATHIONE REDUCTASE (GR) (GRASE)


ORF " Name 


,'NTID .AAID 


' NT 
Length 


AA 

T — Score 
Length 


■ Probability 


|32287567_12J-5. 


|'|1101 | 3021 


I 1422 ! 1 


(425 9 | , |4932 • 


,0.0 ■ | 


Protein name 






Locus Name 


, Acc# 


99% identity over 1407 amino acids with E. coli


... Igpi^TVyTMl-'l 


, AF170176 










Description 










Salmonella typhimurium fragment STMF1 . 






ORF Name 


NTID AAID* . 


■NT 
Length 


AA 

. — . , Score 
Length . 


Probability 


390716<S_il_£ • 


1102 3022 


3:09 


[930 , | . 117 | 


1.6e-06 


Protein, name 






Locus Name . 


Acc# , 


putative biotin protein ligase ■ 




gp:AF016461


AF016461



Description 



Bordetella pertussis putative biotin protein ligase (birA) gene, complete cds
and Bvg accessory factor (baf) gene, partial cds.






ORF Name


' NT ID 


AAID 


Length 


AA 

t ~^t_ Score 
Length — ; ' 


Probability r 


4507138_11_5 


| 1103 


| 


727 


pi84 | 736 | 


2.5e 


-88. 


Protein name . 








Locus Name 




. Acc# 










sp:PRC_ECOLI




P23865 


Description 














PROTEIN) • ; 


ORF Name 


NTID 


AAID 


NT 

Length 


AA 

T — Score 
Length - ■■ 


Probability 




| 1104 


| |3024 


83 


2b2 | 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT 


ORF Name 


NTID 


, AAID 


NT 

Length 


■ 1 ' AA 
■ — — ■ Score 
Length 


Probability 


5ii9452_c3_42 


. 1105 


3.025 


267 


. 804 | 358 


5.9e 


-37 


Protein . name 








Locus Name , 




Acc# - 


: '» 








sp:Y902_HAEIN




P44070 


Description 














HYPOTHETICAL PROTEIN HI0902 


ORF Name • 


NTID 


AAID 


NT ■ 
Length 


• AA ' *''•• 
- — . , : : .Score 
. Length 


Probability 


Ii9.5.525.1_c2_26 


, 1106 


| 3025 


915 


2748 '273.5 


1.3e 


-2 84 


Protein name 








Locus Name 




Acc# 










sp:SYA_ECOLI




P00957:P78279


Description 












ALANYL-TRNA SYNTHETASE, (ALANINE--TRNA LIGASE) (ALARS)






ORF Name 


. NTID 


AAID 


NT 

Length 


AA Score 
Length 


Probability 


16b2J292 : _C^_2b 


1107 


3027 


] 1488 


1467 1534 | 


2.4e 


-157" 


Protein name 




j. 




Locus Name. 




Acc# 










sp:PUR8_HAEIN




P44797


Description 














ADENYLOSUCCINATE LYASE, (ADENYLOSUCCINASE) (ASL)










ORF Name 


NTID 


AAID 


NT 
Length 


AA . 
Length 


Score 


Probability 




llUt) 




73 


V 222 • 






Protein name ' 








" Locus Name 


Acc# " 


Description 














NO-HIT


ORF Name 


NTID 


.AAID . 


NT 
Length 


AA 
Length 


Score 


Probability 












24308500_ci_22 ■ 


1109 


3029 


155 


458 


282 


1.2e-24



Protein name 



Locus Name 



erythroid differentiation-related factor 2



AF040248 



••; Acc# . 

'■ AF040248 



Description 



Homo sapiens erythroid differentiation-related factor 2 mRNA, partial cds.


NT AA 

ORF Name NTID AAID , 

. Length Length 


Score 


' Probability 


31444625^c3_34 - 1110 3030 452- 1359 


646 


3.1e-63



Protein name 



Description 



Locus, Name 



sp:YCLF_BACSU



, ACC# 
P94408 , 



HYPOTHETICAL 53.3 KD PROTEIN IN SFP-GERKA INTERGENIC REGION


: NT ' AA „ ■ ■ 
. ORF Name NTID AAID . , — ■ ■'■ — . , Score 

Length Leny Uh 


.Probability 


33632752_c2_27 


1111 • 3031 - 285 . 


855 |679 


p.8e-67 v 



Protein name 



aspartate kinase, II precursor: lysine-sensitive aspartokinase II



Description 



Locus Name 



pir:A48946



Acc# 



A48946:B48946:C48946



ORF Name 



33829043 c3 33 



. Protein name 



NTID . AAID 



11112 



JUJT 



NT AA 
Length Length 
TuT"~ — 1 



Score Probability 



Locus Name 



Acc# 



Description 



NO-HIT 






ORF' Name 
4400293 cl 23 



Protein name 



Description 



NO-HIT



. NT ID AAID 



TTTT 



■JUJT 



NT *. AA 
Length Length 
95 



Score 



7WT 



Locus Name 



Probability 



. Acc# 



ORF Name 



15852262 cl 21 



Protein name 



Description 



NO-HIT



NTID AAID' 



TTTT 



NT • - AA 
Length i Length 
f£T 



Score Probability 



Locus' Name 



Acc# 



ORF Name . 


NTID 


AAID'. 


NT 

Length 


AA ■' , »■■ 
, — . , ■ Score 
Length 


Probability . 


9933552_tl_3 


' 111b 


| P 035 


32 


249 |, 




Protein name . " : 








Locus Name 


Acc# . 


Description ' 












NO-HIT


,ORF Name 


NTID 


AAID 


NT 
Length 


AA . 

— ; , Score 
Length 


Probability 


9975052_c2 L 24 " 


1116 




132 . 


399 | 




(•■'■■ 

Protein name 








Locus ' Name 


Acc# 


Description 












NO-HIT


ORF Name 


. NTID 


AAID 


NT 
Length 1 


AA 

— • Score 
Length - 


Probability 


1036637_c3_j76 


1117 


. 3037 


.61 


186 • 




Protein name 








. • Locus Name 


Acc# 



Description 



NO-HIT






ORF Name 



NTID 



AAID 



1103888b ti 26 



TITS" 



TOTS" 



, NT AA 
Length . Length 
1416- | 11251 



Score 



Probability 
1.7e-156 



Protein name 



Locus Name 



lactate dehydrogenase 



gp:NMU58911



Acc# 
U58911 



Description 



Neisseria meningitidis lactate dehydrogenase (lldA), HI0379 homolog genes,
complete cds, HI1054 homolog gene, partial cds.


NT i ■ AA 

ORF Name -NTID AAID — • • — 

'■ Length Length 


Score 


Probability 


10547558_cl_I80. 1119. 3039 , 176 531 


123 ■ 


8.1e-08



Protein name 



Locus Name 



hypothetical protein APE1165 



pir:H72586 



Acc# 
H72586



Description 



ORF Name 



NTID 



AAID 



111711 c3 26b 



1120 



3040 



NT,. AA 
Length Length 
434 I. 11305 



Score 



Probability 
1.3e-46



Protein name 



Locus Name 



HisX 



gp:AF010189 



Acc#
AF010189



Description 



Pseudomonas stutzeri HflC (hflC) gene, partial cds; HisX (hisX) gene,
complete cds; and PurA (purA) gene, partial cds.


. NT AA 
. ORF Name NTID 'AAID • , " ;. — , 

... - Length Length 


Score 


Probability 


11991552_t3_108 ... 1121 3041 286 ; 861 


. 614 


7.6e-60



Protein name ; ■ 

Description . . 
INDOLE-3-GLYCEROL PHOSPHATE SYNTHASE, (IGPS)



Locus Name 



sp:TRPC_PSEPU



Acc# • 
P20578 



ORF. Name 



NTID ' 



AAID 



112222077 F2 53 



TTZT 



TttT 



NT - AA 
Length Length 



Score Probability ^ 
|463 | | i.8e-44 — 



Protein name 

■ ! 

Description 
RECOMBINATION PROTEIN RECR



Locus Name 



sp:RECR_HAEIN 



Acc# 
P44712 






ORF Name 



NTID 



AAID 



AA 

~ r ^ r - Score . 
Length Length , ~ 



135202 cl 167 



TTZT 



TOW 



NT 

n 



Probability 
0.0012



Protein name 



Locus Name 



hypothetical protein sll1675



pir:S74549 



Acc#
S74649 



Description 



ORF Name 


NTID 


AAID 


Ttf T 
JN 1 

Length 


_ — . . Score 
Length 


Probability 


13678763. Cl_18 2 


1124. 


3044 


223 


I 672 1 P 4 1 


2.4e-15


Protein name 








Locus Name 


Acc#. 


hypothetical protein RP471






pir:D71706


D71706 


Description 












ORF Name 


NTID 


AAID 


• NT 
Length . 


AA ■ 
T ^. i Score'-' 
Length 


Probability 


14094452_c3_257 


1125 ■ 


3045 ' 


60 | ' 


183 ■ 




Protein name 








Locus Name " 


Acc# 


Description. 












NO-HIT


ORF Name ' r 


NTID 


AAID 


.NT 
Length 


AA : •■ 
v : — , , .Score " 
Length 


' Probability . 


i4i03402__il_7 


•ilN 


3045 


285 ■■ 


851 255 , 


7.3e-23 



Protein name 



Locus Name 



sp:YPUG_BACSU



Acc# 
P35154



Description ' - . , . 

HYPOTHETICAL 29.5 KD PROTEIN IN RIBT-DACB INTERGENIC REGION (ORFX7)



ORF Name ' 



14273585 12 57 



NTID . . AAID 
1127 



TOTT 



NT 
Length 
3"5T" 1 



AA 
Length 



Score 



Probability 
1508 | |1.4e-154 ■ ' 



Protein name 

Description 
CARBOXYLASE , ) (ACC). 



Locus Name 



sp:ACCC_PSEAE 



Acc# . 
P37798 






ORF Name,: NT ID AAID 


NT 
Length 


AA 

T — \ , Score 
Length 


Probability 




14572132jt3_13S ■■ 1128 1048 


238' 


717 |736 | 


|U .9e-73 ; 




Protein name 




' Locus Name 


Acc# 








sp:lM«_HAHIK 


P44319 


Description 










LYASE) , ; ' ■ 


ORF Name NTID AAID 


NT 
Length 


AA 

; — . Score 
Length 


, Probability 




14875305_t3_113 | (1129. 3049 


| 139 


1420 | 141 


|1.2e-09 




Protein name 




Locus Name 


. ,Acc# 




Ribonuclease D (EC 3.1.13.-)




gp:D90825






Description 






D90825:AB001340


E.coli genomic DNA, Kohara clone #334 (40.6-41.0 min.).


ORF Name NTID AAID ' 


NT 

■ Length 


AA 

„ . — . i Score 
Length 


. Probability 




i5507827_c3_255 1130. 3050 


519. 


1550 |1030 | 


6.3e-104




Protein name • . 1 




Locus Name 


■ Acc# 








sp:NADB_PSEAE


J Q51363 


Q51 


Description . 






..412 


L-ASPARTATE OXIDASE, (QUINOLINATE SYNTHETASE B)






••. ■ , ■ ,-ti, • . 

ORF Name NTID AAID ' 


NT.-' 
Length 


AA 

t i Score 
Length 


Probability 




15510254 cl 154 : < 113i 3051 


3 0B | 


927 J1063 | 


|2.0e-107 




.Protein name 1 




1 Locus Name 


Acc# 




i . 'i 




sp:RF1_ECOLI


— 1 P07011 


P77 


Description - 






340 


PEPTIDE CHAIN RELEASE FACTOR 1 (RF-1)









ORF Name 


NTID AAID 


NT 

Length 


AA 

T . — , , Score 
Length 


Probability ... 


ISS29177_cl_I72 


1132 | 3052 j 


95 


2ua p | 


0.041


Protein name 








Locus Name 




Acc# . 


F1N21.17




gp:AC002130


AC002130 


Description 












The sequence of BAC F1N21 from Arabidopsis thaliana chromosome 1, complete sequence.














ORF Name 


NTID AAID 


NT 

Length 


AA 

. — . • Score , 
Length 


Probability 


I95S2800_:t2_52 


1133 |3053 


I 120 1 . 


353 273 


1. Oe 




Protein name 








Locus Name 




Acc# 










sp : YBAB_HAEIN 


P44711


Description 














HYPOTHETICAL PROTEIN HI0442 . 


ORF Name ' 


NTID AAID 


nt : 

Length 


AA 

-j- — . , Score 
Length 


Probability 


2l)-3i2-55I_±l_fl- 


1134 - ' 3054 


202 


60? 24.3 


1 . 6e 


-20 


Protein name 








Locus Name • 




Acc# ' 










sp:YPUH_BACSU


P35155 


Description 














HYPOTHETICAL 22.0 KD PROTEIN IN RIBT-DACB INTERGENIC REGION (ORFX8)


ORF Name 


NTID .AAID 


NT 
Length . 


AA ' . 
, — - , Score 


■ Probability 






Length ■ , .- 






|2..Q573252_c2_229 


| llih' | |3055 


205 


|618 305 | 




-21 | 


Protein name 








Locus Name 


j- 


Acc# ,' 






\ ■ ■ . ■ 




sp:YDJA_ECOLI


P24250 


Description 














HYPOTHETICAL 20.1 KD PROTEIN IN SELD-SPPA INTERGENIC REGION (ORF183)


ORF' Name 


NTID AAID 


NT AA 
T — ^ t — , Score 
Length Length 


Probability 


21667027^t2_89 


1136 3056. 


832 


2499 |2093 | 


U..4e- 


-216 | 



Protein name 

Description 
ATP-DEPENDENT PROTEASE LA,



Locus Name 



sp:LON_ERWAM



Acc# 
P46067 






ORF Name v 


, NT ID AAID 


NT-' 

Length 

—> 


AA • 
; Score 
Length 


Probability 


2I581503_tl_9 


| 1137 3057 


329 


. 990 |758 | 


|4.2e 


1 


Protein name 








Locus Name 




ACC# 










sp:YCIL_HAEIN




P45104 


Description 














HYPOTHETICAL PROTEIN HI1199




ORF Name 


NT ID AAID 


NT 
Length 


AA 

f — . , • ■ Score 
Length 


Probability 




|1138 |305S 


197 | 


|b94 | 284 | 


7.1e 


-25 


Protein name 








Locus 'Name 




Acc# 










sp:YBEY_ECOLI 




P77385 


Description 














HYPOTHETICAL 17.5 KD PROTEIN IN CUTE-ASNB INTERGENIC REGION








ORF Name 


NTID , AAID ' 


V NT 
Length 


AA 
Length 


Probability 


22128380_il>l 


1139 3059. 


• 87 


254 . 






Protein name ■ 




. i 




Locus Name 




Acc# . 


Description 














NO-HIT




ORF. Name , , 


NTID i AAID ..' 


NT 
Length 


AA 

Length . ; — \. 


Probability 


223{U635_tlJJl 


, 1140 3050 


190 


573 • |504 | 


|8.7e 


-59 7 


Protein name 








' Locus Name 




Acc# , 


HemO 






|gp:AF133695 




AF133695 


Description 














Neisseria meningitidis HemO (hemO) gene, complete cds; and HmbR (hmbR) gene, partial cds.
















ORF Name 


NTID AAID 


NT 

Length 


AA 

* — ^ Score 
Length 


Probability 


23444377_cl_199. 


1141 . 3061 . 


492 


. 1479 12364 1 


|2.7e 


-245 . ' 


Protein name 








Locus: Name 




Acc# 


outer membrane protein E


gp:MBOOMPE


L31788 


Description , 







Moraxella catarrhalis outer membrane protein E gene, complete cds.



NT AA 

ORF Name • ' NTID AAID - — ' - ' - — ■ Score Probabil 

• ,■ -^-^ — - -• .• Length . Length - — — ■ 



|234700_c3_268 


1142 3062 . 344 


. 1035 , |1002 | 


|5.8e 


-101 


Protein name 






Locus Name 




Acc# 


unknown 






gp:AF109131




. AF109131 


Description 




Sinorhizobium meliloti homogentisate dioxygenase (hmgA)
and maleylacetoacetate isomerase (maiA) genes, complete cds; and unknown gene.




ORF Name 


NT 

' NTID AATD 

Length 


AA ' 

— . , Score 
Length 


Probability 


23620263_c3_271 


1143 ~ 13053 257 ^ 
1 


904- 534 


2.3e 


"bl . 


Protein name 






Locus Name 




Acc# . 








sp : KDSB_ECOLI 




' P04951 


Description,. . 












SYNTHETASE) (CMP-2-KETO-3-DEOXYOCTULOSONIC ACID SYNTHETASE) (CKS)






. ORF Name - 


NT 

NTID AAID ■ — , 
Length 


AA 

" . Score 
Length 


Probability 1 


23651513 J:l_43. 


|1144- • 3054 •■■ 259 


780 [589 | 


3/4e 


->7 " 


Protein name 






Locus -Name 




Acc# 








sp:NADC_RHORU




P77938


Description '_ ■ 












) (QAPRTASE)




ORF Name 


NT 

NTID 'AAID — , 
Length 


AA' 

. Score 
Length . 


Probability 


2403i;586_rl_5 . 


1145 | |3065 . 363 


1152 975 


3.3e- 


-98 


Protein name 






Locus Name 




Acc# 








sp:TRPD_ACICA




P00500


Description ■ 












ANTHRANILATE PHOSPHORIBOSYLTRANSFERASE,




, ■! • - ■ ■ 
ORF Name : 


NT. ' 

NTID AAID — • 
Length 


AA 

* ■ — Score Probability. 
Length — — — *: 


p4397555_r3_140 


1145 • | |3065 219 


[660 | 165 


1.2e-11


Protein name 






Locus Name 




ACC# ■ 


probable corA protein


pir:F70952


F70952 • 



Description , 



si! 



ORF Name 



NT ID 



AAID 



24415782 t3 106 



TTTT 



NT AA : 

Length Length 



Score Probability 
|87 | [0.0079 ~~ 



Protein name 



Locus Name 



UUP protein 



gp:ECUUP



Acc# 
Y09439 



Description .. 
E.coli uup gene, partial.



ORF Name 


NTID


AAID 


NT 
Length *■ 


AA 

— - Score 
Length 


Probability 


2472I962_c3_?^ 


| 114a 


3068. 


1 V 


234 




Protein name. 








Locus Name 


'■; Acc# 


Description 












NO-HIT


ORF Name 


.. . "■ NT ID 


AAID 


NT ;. 

Length 


AA - 

. — . ; Score 
Length , 


Probability 


|2480i937_c2_23<S 


1149 


• : ; 3069 


63 


192 




Protein name 




','» •' 




■ ' Locus - Name 


; Acc# 


Description 


I, 










NO-HIT


ORF Name ■ ' ! , 


NT ID 


' AAID 


NT. 
Length 


AA 

— , Score 
Length 


Probability- 


|26756325_.ii_i 


1150 


3070 


314 


9.45 . |1021| 


|b.6e/-103 


Protein name - 


- . ?l 






Locus. Name 


. ,Acc# 










sp:OTCA_PSJJaH 


Q02047


Description ' 












(EC 2.1.3.3) (OTCASE)


ORF Name ' 


NT ID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|267577?0_c3_2bb 


| 1151 


3071 


■ 375 


' 1131 ■ |1065| 


|1.2evi07 * , | 


Protein name 








■Locus Name 


Acc# 



sp:NADA_ECOLI



Description 
QUINOLINATE SYNTHETASE A



P11458:P77373






ORF Name 


NT ID 


AAID. 


NT AA 
Length '.Length 


Score 


Probability 


2742036_ii_;3 


11S2 


3072 


144 


43 5. 




426 


6 .3e 


-40 


Protein name 










.Locus Name 




Acc# 












sp:PAND_BACSU




P52999 


Description 




















decarboxylase ) ■ ■ . ■ 


ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 
Length 


Score 


Probability 


29301691_i:l_i4 


1153 


. 3073 


248 


I 747 




140 


|4.7e 


-07 


Protein name 










. Locus Name , 




Acc#- 






" 'v. . 






sp : RND_HAEIN 


P44442


Description 




















RIBONUCLEASE D, (RNASE D),


















ORF Name 


NTID' 


AAID 


' NT 1 
Length 


■ AA . , 
Length 


''Score . 


Probability 


29355297^ ±2_99 


. 1154 , 


3074, 


74 • 


225 . , 




79 


0.013


Protein name 










Locus Name- 




Acc# 


MutT/nudix family protein




pir:A75550 




A75550 


Description 

'.j ■ 




















ORF Name 


NTID . 


AAID • 


. NT 
Length 


AA 
Length 


. Score 


Probability 


29484626-_ci_i73 


1155 . 


3075 


355 . 


1071 






2 . 4e 


-57; 


Protein name 










Locus Name. 




-, Acc# 












sp:YGI2_PSEPU




P31857 


Description 




















HYPOTHETICAL 32.4 KD PROTEIN IN GIDB-UNCI INTERGENIC REGION






ORF Name 


NTID . 


AAID' '•' ' 


'NT 
Length 


AA 
Length 


Score 


Probability 


2953i3S2_±i_40 • 


1155, 


3075 . 


124 


375 




173 


<l.-5e- 


" u ' ■ 1 


Protein name 










Locus Name 




ACC# 


lustrin A








pir:T08852




T08852 - 



Description 






ORF Name 


NTID 


AAID 


NT 
Length 


AA , 
Length 


Score 


Probability 


29572155_t2_49 


1157 


| 3077 


343 


- 1032 


1406 1 


|8/3e 


-38 


Protein name 










Locus Name 




Acc# 












sp:HTRB_HAEIN






Description 
















P45239:Q48045


PROTEIN B) • 




-. ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 














29890.S3i_tS _56 


1158 


3078 


157. 


474^ 


259 


3.ie 




Protein name 










Locus Name 




Acc# 












sp:BCCP_HAEIN




P43874 


Description . 






• 












BIOTIN CARBOXYL CARRIER PROTEIN OF ACETYL-COA CARBOXYLASE (BCCP)




ORF Name ' 


NTID. 


AAID . 


NT 
Length 


:, AA 
Length 


Score ■ 


Probability 


3020483.7_c2_230 ■ 


1159 


3079 ■ 


447 


1344 




2;:5e 


-61 ' • 


Protein name 










Locus Name 




; . Acc# 












sp:GPDA_ECOLI




P37606 


Description •. ...«•' 






















ORF Name 


NTID , 


AAID 


NT 

Length' 


AA . 
Length 


. Score 


Probability ". 


30330056 12 48 : 


1160 


3080 


233 


: 702 


1^1 


2vle- 


-48. . | 


Protein name 










Locus Name ■ 




Acc# 


putative ATP -binding protein 


gp:NME242841 • 


AJ242841


Description, '-. 






Neisseria meningitidis DNA for opcA region, strain








ORF Name 


NTID 


AAID . 


NT 
Length 


AA 
Length 


Score . 


Probability 


32878_c2_250 


1161 


3081 | 


|80 


, 243 


8j _ r 


0.00033 


Protein name 






- u 




Locus Name 




ACC# 












sp:SLYX_ECOLI- 




P30857, 



Description 



SLYX PROTEIN 






ORF Name 



NT ID 



AAID 



34180317 tl Tl 



11162 



NT AA 
Length Length 
1115 



Score Probability 
[420 | |2.7e^39 



Protein name 

Description 
PROBABLE GTP-BINDING PROTEIN HI0393



Locus Name 



sp:YCHF_HAEIN



Acc# 
P44681 



ORF Name 


NTID 


AAID 


NT' 
Length 


AA 

T — Score 
Length 


Probability- 


344i7090__13_lbl 






I 76 1 


■■231 • ' 






Protein name 








Locus Name 




. Acc#. 


Description 














NO-HIT - " 


ORF Name,- 


NTID 


AAID 


NT 
Length 


■ AA " 1 
T — . , Score 
Length 


Probability 


|3Si9SS02J:3jL!i3 


1154. 


3084 


616 - 


16.51- 531 


1.2e 


-61 


Protein name 








Locus Name 




. Acc# 


L-lactate permease (lctP) homolog




pir:F69350




F69350


Description 














ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ . Score 
- Length • 


Probability . 

•i 


3$335i37_ci_17$ 




|3085 . 


.330 


l ' 553 ; 302 | 


8.7e 


-27 



Protein name 



Description



Locus Name 



Acc# 
. P52024 



DNA POLYMERASE III, DELTA' SUBUNIT,








ORF Name 


NTID 


AAID 


NT •'' 
Length 


AA 

Length - 


Probability 


|35353443_c2_21?> | 


1166 . 


3086 | 


347 


1044 |il28 | 


|2.6e-114 ] 


Protein name .« ;■ 








Locus Name. 


Acc# ' 



sp:PURA_VIBPA



P40607 



Description ... . 
ADENYLOSUCCINATE SYNTHETASE, (IMP--ASPARTATE LIGASE)






ORF Name 



NTID 



AAID 



35603128 13 150 



TTST" 



NT AA ' 

Length Length 
80 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID 



AAID 



3914143 c2 209 



TT61T 



NT ■ AA 

Length Length 
185 



Score 



Probability 
2.1e-29 



Protein name 



Locus Name 



ExbB protein 



gp:BPE132741 



Acc# 
AJ132741 



Description 



Bordetella pertussis hupB, tonB, exbB, exbD and basR genes and
ORF1 (partial).



ORF Name 



NTID 



AAID 



3914811 c3 280 



TTUT 



NT , AA 
Length - Length 
1370 I . 11113 



Score 



[5"7T~ 



Probability 
|2.7e-87 



Protein name 



Description 



Locus Name 



sp:YHCM_ECOLI



Acc# 
P46442 



HYPOTHETICAL 43.1 KD PROTEIN IN RPLM-HHOA INTERGENIC REGION (F375)




ORF Name 




NT 

NTID AAID " — • . 

Length 


AA • 

_ ■ — tl ... Score ' 
Length 


Probability 


3937518-ci_171 




| 1170 | 3090: ■ 264 


795 518 


2.9e 


-60 


Protein name 








Locus Name 




Acc# 










sp:YGI1_PSEPU


P31856 


Description^ 














HYPOTHETICAL 28.9 KD PROTEIN IN GIDB-UNCI INTERGENIC REGION






ORF Name 




NT 

NTID AAID " , * — 

. Length 


AA 

„• — . , Score 
Length 


Probability . 


|3938818_ci_166 " 




| 1171 3091 | 65 * 


198 228 


6.1e- 


-19 


Protein name 








Locus Name 




ACC# 


PurA




gp:AF010189 


AF010189 



Description 



Pseudomonas stutzeri HflC (hflC) gene, partial cds; HisX (hisX) gene,
complete cds; and PurA (purA) gene, partial cds.



ORF Name 



NTID 



AAID 



3946931 t3 114 



TTTT 



TOUT" 



NT AA 
Length Length 
216 ■ 



Score Probability 



4.0e-54



Protein name 



Description- 



Locus Name 



Acc# 
U89166 



Eikenella corrodens lysine decarboxylase (ECORLD) gene, complete cds.


ORF Name 




NTID 


* ' nt AA 
AAID , ——, , , — . Score Probability 
Length Length -■ 1 ■ 


3960250_tl_ 


25 


1173 


3093 253 , 


762 282 1.2e-24 



Protein name 



Locus - Name 



probable transcription regulator 



pir:T34763 



ACC# 
T34763



Description 



ORF, Name 



NTID AAID 



14022217 12 .6 8 



TTFT 



3094 



NT 
n 
UTO" 



AA 

Score 

Length Length — 



35" 



Probability 
0.00087



Protein name 



Locus Name 



hypothetical protein F53A9.8



pir :T16439 



Acc# 
T16439 



Description 
ORF Name 



NTID AAID 



4023443 c3 278 



[TT7T" 



3095 



NT AA, 
Length , Length 
145; 



Score 



Protein name 



■ Description 



Locus Name 



sp :NDK_PSEAE 



Acc# ' 
Q59636 



Probability 
l.be-47 : 



NUCLEOSIDE DIPHOSPHATE KINASE, (NDK) (NDP KINASE)


ORF Name ' NTID AAID. , NT 

Length 


' AA ' 
_ — . , ' Score 
Length 


Probability' 


4101387_11_4 |1176 


3096, • 219 


660 |710 | 


b.le-70 • 



Protein name 

Description 
TRANSFERASE) 



Locus Name 



sp:TRPG_PSEAE



Acc# . 
P20576






ORF Name 



NTID 



AAID 



4I4bOO0 Tl 47 



TTTT 



NT ... 
Length 

415 K 



AA 



_ — _ Sco re Probability 
Length — — ■■ 2 



|1.0e-96 



Protein name 



Description 



Locus Name 



sp:UUP1_HAEIN



Acc# 

Q57242:O05056



ABC TRANSPORTER ATP-BINDING PROTEIN UUP-1








ORF Name 


NTID 


• ; AAID ' ! 


NT 
Length 


AA 
Length 


Probability 


4181502_i3_i49 


1178 


,3098 


130 


■ 3 


93- 




Protein name 










Locus Name 


Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T — , , -Score 
Length 


Probability' 


|4182762_12_51 < 


1179 


3099 


3B8 


1077 374 


1.9e 7 49 


Protein name ' 










Locus Name 


'•■ Acc# 


tryptophan--tRNA ligase,








pir:H70385


H70385 ' 


Description 














ORF Name . 


NTID 


AAID 


. ■ NT 
Length 


AA 

T — Score 
Length 


Probability 


424203_£1_24 


1180'., 


. 3100.. 


350. . 


|10S3 • 542 


3.2e-S2 ; 


Protein name 










Locus Name 


■ .Acc# 


putative exodeoxyribonuclease (EC 3.1.11.2)


gp:SCE87


AL132674


Description. 












— TTT" 1 


Streptomyces coelicolor cosmid E87. 


ORF Name. 


NTID 


AAID 


" NT ' 
Length 


AA n - 
r — . , Score 
Length 


. Probability 


4331262^13141 


1181 


3101 


.222 . 


669 |402 | 


2.2e-37


Protein name 










Locus Name 


Acc# 


probable corA protein 






. pir:F70952" 


... F70952 


Description . 


















ORF Name ■ 



NTID 



AAID 



■' NT • AA ■ 

t 1 ■ t " — Score 

Length Length — ■ • 



Probability 



T^nG Q30 - ci 174 1185 t 3102 


356 


1071 1504 1 


|3 . 4e-48 " ■ 


Protein name. 




Locus Name 


Acc# 






sp:LPXK_HAEIN


' . P44491 


Description 






'" ' ; 


TETRAACYLDISACCHARIDE 4'-KINASE, (LIPID A 4'-KINASE)




ORF Name NTID AAID 


NT 
Length 


AA . ■ 
, — . , Score 
Length. 


Probability 


453.7837_t3_145 | 1183 - | |3103 




|651 | |477 | 


p.be-4b . | 


Protein name - 




Locus Name 


Acc#


YciB homolog




gp:AF114793 


AF114793 


Description" 


Vitreoscilla sp. YciB homolog, putative transcriptional activator, putative
outer membrane protein, BioA homolog, and glutamine synthetase homolog genes,
complete cds; and unknown genes.








ORF Name NTID. . AAID 


nt 

Length 


AA 

^ .. — , Score • 
Length 


Probability 


4571880_rl_16 • 1184 | 3104 


| 215 


. 651 437 


4.3e-41 | 



Protein name 



YbeZ protein 



Locus Name 



gp:STY249116 



Acc# . 
AJ249116



Description 








Salmonella typhimurium yleB (partial), miaB, ybeZ and ybeY (partial) genes.


,. NT - 

ORF Name , NTID AAID — : ' 

Length 


AA' 

' — , Score .. Probability 
Length ; \ - . . ■ , 


4722125_c2_211 • .1185 


3105 465 


1398 


1197 1.3e-121 , 



.Protein name 



Description 



Locus Name 



sp:Y325_HAEIN



Acc# 
P44640 



HYPOTHETICAL PROTEIN HI0325



ORF Name 



NT ID AAID 



4793753 ci 155 



NT 
Length 
1141 



, AA 
Length 

1 425 



Score Probability 
1274 I |8.1e-24 



Protein name 



Locus' Name 



ExbD protein 



gp :BPE132741 



Acc# 
AJ132741 



Description 



Bordetella pertussis hupB, tonB, exbB, exbD and basR genes and
ORF1 (partial).


ORF Name . NT ID . 


AAID 


NT 
Length 


AA . 
Length 


Score 


Probability 


4875005 tl_30 , 1187 , 


| 3107 


8^8 :| 


2487 J 


. 1231 


3,ie-12b . 



Protein name 



Locus Name 



hypothetical protein TM1869



pir:F72202



Acc# 
F72202 



Description 
ORF, Name 



NTID . AAID. 



4978375 c3 293 



3108 



NT ■ 
Length 

j&r — 




Protein 'name 



Locus Name 



beta-ketoacyl-acyl carrier protein synthase III



pir:B64545



Acc# 
B64545 



Description 
ORF Name 



. NTID ' AAID 



5194058 11 35 



TTW 



3X0TT 



NT. • 
Length 
162 0 " 



. AA 
Length 
11853 



Score - Probability 
11258 I |3.8e-129 — " 



Protein name 



Description 



Locus Name 



sp:KEFX_HAEIN 



, Acc# 
P44933 



ANTIPORTER)


ORF. Name 


NTID 


AAID 


NT, 
Length 


AA ' 
Length 


Score 


Probability 


5975555_13i_lil 


| 1190 ■ 


| 3110 


759" | 


2310 


|3095| 


,.■ ,| 



Protein name . 



Description 



Locus Name 



sp:RIR1_ECOLI



Acc# , 

P00452:P78088:P78177



(RIBONUCLEOTIDE REDUCTASE 1) (B1 PROTEIN) (R1 PROTEIN)



ORF, : Name 


NTID 


AAID 


NT 

Length, 


AA 
. Length 


Score 


Probability 












636513_t3._126 


|1191 ■ 


3111 ' 


|81 


246 






Protein name 








Locus Name


Acc# 


Description 






i 








NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA . 
Length 


Score 


Probability 


6.516 Jt2_45 


S | 1192 


3112 - 


361 


.J1066 


176 - 


7.2e-13 



Protein name 



Locus Name 



hypothetical protein 



pir:S76259 



Acc# 
S76259 



Description 
■ ORF Name 



NTID 



AAID 



6828128 c2 223 



TT5T 



NT AA 
Length Length 
201 



Score 



fT03- 



Probability 
0.00066



Protein name 1 



Locus Name' 



phosphoglycerate mutase 



pir :G72260 



Acc# 
G72260 



Description 
■ORF Name 



NTID 



AAID 



16837827 ci 183 



NT 

n 

T6"F 



AA 

-r— Score 
Length Length 



T"3~5~ 



Probability 
3.8e-10 — 



Protein name 



Description 



Locus Name 



sp:Y400_SYNY3



. Acc# ■ 
Q55129 



HYPOTHETICAL 18.3 KD PROTEIN SLL0400










ORF Name 


NTID AAID


NT 
Length 


AA 

— , Score , 
Length 


Probability 


786578_c3_ ; _259 


| 1155 3115 | 


86 


|261 | 55 


, 2.8e 


-05 


Protein name 1 






Locus Name 




Acc#' 


unknown 


|gp:AF114793 


AF114793 



Description 



Vitreoscilla sp. YciB homolog, putative transcriptional activator, putative
outer membrane protein, BioA homolog, and glutamine synthetase homolog genes,
complete cds; and unknown genes.






ORF Name 



NTID 



AAID 



969200 c3 270" 



3116 



NT. AA 
Length Length 
234' 



Score 



Probability 
5.2e-33 



Protein name 



Description 



Locus Name 



sp:GIDB_ECOLI



Acc# 
P17113 



GLUCOSE INHIBITED DIVISION PROTEIN B










ORF Name 


NTID AAID 


KIT 
IN 1 

Length 


AA 

, — . , Score 
Length • 


Probability 


j 970375J:2_67 


| |1197 | |3117 , 






840 | 921 | 


2 . 2e-92 . | 


Protein name 








Locus Name 


, • Acc#, 


probable GTP-binding protein HI0393






pir:I64150 


. I64150 : 


Description 












i- . . 
: ORF Name 


NTID AAID • 


, NT 
Length 


AA 

. — , Score 
Length 


Probability 


976531_t2_59 ... 


1198 • 3118 


171 




516 |437 | 


|4.3e-41 


Protein name 








't . ■ 
Locus Name 


■ Acc# 


YbeZ protein 








gp:STY249116


AJ249116 


Description 












Salmonella typhimurium yleB (partial), miaB, ybeZ and ybeY (partial) genes. 




ORF Name . : 


NTID AAID. ' 


• NT • - 
Length 


AA 

' T — , i Score 
Length 


Probability 


9S592E_r3_124 


1199 3119 


128 




387 




Protein name 








Locus • Name 


"" Acc# . 


Description 












NO-HIT




ORF. Name 


NTID AAID . 


.NT 
Length 


AA 

— , ,; Score 
Length 


Probability 


99b4828_c2_20U . 


1200 • 3120 


322 




969 |170 | 


. |3.9e-14- 


Protein name 








Locus Name 


\ Acc# 


TonB2 








|gp:AF190125 


AF190125 . 


Description 












Pseudomonas aeruginosa TonB2 (tonB2), ExbB (exbB), and ExbD (exbD) genes, complete cds.


















ORF Name 



NTID AAID 



1097b667 c2 74 



TIUT 



JTTT 



t NT AA 
Length Length 
376 I. 11131 



Score . Probability 



2 . 6e-5CT 



Protein name 



Locus Name 



thiamine-monophosphate kinase



|gp:D17333 



Acc# 
D17333 



Description . \ 

E. coli thiL gene, complete cds.



ORF Name 


' NTID 


AAID 


NT 
Length 


, — . , Score 
Length 


Probability 


1250008i_c3_90 


1202 


3122 


270 • 


|813 | 


855 


2.2e 


1 


Protein name 










Locus Name 




Acc# 












sp:HIS6_AZOBR


i 


P26721


Description' ' . 


















HISF PROTEIN (CYCLASE)
















ORF Name 


> " NTID: 


AAID 


NT 
Length 


AA 

T — , , Score 
Length 


Probability 


14572127_cl_50 




3123 : 


. (16 5 


498 


433 


l.le 


-40 . ' 



Protein name 



Locus Name 



sp:RISB_ECOLI 



Description 

(LUMAZINE SYNTHASE) (RIBOFLAVIN SYNTHASE BETA CHAIN) 



' Acc# 

P25540:P77114



ORF Name .■; 


NTID 


AAID 


. NT 
Length 


aa' ' 

* — , • Score 
Length ■• 


Probability 


197212 _ci_54 , 


1204 




3124 


284 . 


, 852. 


' . i ■ - 


Protein " name 










, Locus Name 


. Acc# . 


Description 












'. ■ 


NO-HIT ;• 


ORF Name 


NTID . 


AAID ; 


NT 
Length- 


AA 

— , Score 
Length 


Probability 1 


217^7011_i:3_32 . 


1205 


3125 . 


■ 317 


• 954 736. 


8.9e-73 



Protein name 



Locus Name 



YatJ 



gp:NGAJ2783 



' Acc# .« 
AJ002783



Description ' , / ; ; 
Neisseria gonorrhoeae aroK, aroB, yatJ genes and open reading frame.






ORF Name 



NTID 



AAID 



11205 



NT ■ 
Length 
'201 } 



AA 
Length 
1606 



Score Probability 
|190 j |2.4e-30 " — 



Protein name : 

Description 
PHOSPHATIDYLGLYCEROPHOSPHATASE A



Locus Name 



sp:PGPA_HAEIN



Acc# 
P44157 



ORF Name 



NTID. AAID 



23916007 c3 94 



TTOT" 



TTTT 



NT 
Length 
196 



AA 
Length 
1591 



Score 



Probability 
1.0e-23 " 



Protein name 



Locus Name 



methylase 



|gp:LLCPJW565 



Acc# 
Y12736 



Description
Lactococcus lactis cremoris plasmid pJW565 DNA, abiiM, abiiR genes and orfX.



ORF Name 



NTID AAID 



23947167 .13 37, 



1208: 



TT21T 



NT 
Length 
653'. 



AA 

T Score 
Length - - 



Probability 
|756 | |6 . Oe-aV ~ 



Protein name



Locus Name 



penicillin-binding protein 3



pir:S54872



Acc#
S54872



Description 



ORF Name 



NTID. AAID 



24256697 12 23 



TZUT 



TTZT 



NT 

Length 

soir — 



• AA 
Length 
|lb06 



Score' 



Probability
l.Be-79 , — - 



Protein name 



Description 



Locus Name 



sp:MURF_ECOLI



Acc# 

P11880 :P77 
636 : 007100. 



(D-ALANYL-D-ALANINE-ADDING ENZYME)


■ORF Name NTID, AAID 


NT 
Length 


AA 

, - — . , ' Score. 
Length 


Probability ; 


24353377_cl__45 : 1210; , .3130 


225 


|678 | |155 | 


|3.3e-li 


Protein name 




Locus Name 


Acc# 


hypothetical protein PAB0131 




pir:D75209 


D75209



Description ' 






" . ORF Name 


NTID 


AAID : . 


IN 1 

Length , 


■i. AA 
Length 


Score * 


Probability 


26369016 t2_18 


1 1211 




3131 


233 


900 ■ 








Protein name 








'' ' >, 




Locus Name 




. Acc# 


Description 




















NO-HIT


ORF Name ., 


NTID 


AAID . .; 


NT 
Length 


• AA 
Length 


Score 


Probability 


13357336 ci- 49.'. 
. 


1212 




3132. 


55 


155 


I 50 1 


0.037 


Protein name, 


' ■ 










Locus Name 




Acc# 














sp:DHSD_PORPU




P80479 ' 


Description 




















DEHYDROGENASE, SUBUNIT IV)


















ORF Name 


NTID 


AAID 


. NT 

Length 


AA 
Length, 


Score 


Probability 


359S8792j±2_24 


. ±213 ; 




3133 


201 


505 . 


. 345 


1 . 9e 


-31 


Protein name 






-J - '' ', ■ 






Locus Name 




, ACC# 














sp:TPIS_MORSP




Q01893 


Description 




















TRIOSEPHOSPHATE ISOMERASE (TIM)














ORF Name 


NTID 


AAID 


NT 
Length 


AA ,: 
Length 


Score ' 


Probability 


3907500:±lj5 : 


■ ;, .1214 ■ 




3134 


341 , • 


1025 


227 


9.2e 


-18 


Protein name .. 












Locus Name 




Acc# 


homoserine kinase homolog


pir:T33726 


T33726 


Description 




















ORF Name 


NTID ' 


AAID ■ 1 


' ' NT ■ 
Length 


AA 
Length 


Score 


Probability 
















|393%65_cl_bl 


1215 




3135 . 


|183 


bb2 


| 2 04 | 


2.1e- 





Protein name 



Locus Name 



sp:NUSB_HAEIN



Acc# . 
P45150 



Description

N UTILIZATION SUBSTANCE PROTEIN B HOMOLOG (NUSB PROTEIN)






ORF Name 



NT ID 



AAID 



3553191 ci 44 



NT . AA ' 
Length Length 
b02 | |lb09 



Score 



llbOl 



Probability 
|7.7e-lb4 ■ — 



Protein name 



Locus Name 



glutamyl-tRNA synthetase



gp:AF139107



Acc# 
AF139107 



Description 



Pseudomonas aeruginosa hypothetical multidrug resistance protein (mdr) gene,
partial cds; hypothetical transcriptional activator (act) and glutamyl-tRNA
synthetase (gltX) genes, complete cds; and tRNA-Ala and tRNA-Glu genes,
complete sequence.



ORF Name 



NTID AAID 



141703 13 36 



TTFT 



\JTTT 



NT 
n 
T7U 



AA 

t ~ — Score 
Length , Length — 



Protein name 
Description 



Locus Name 



Probability 



Acc# 



ORF Name 



*■ NTID ■ AAID 



4301943 fl 13,. 



[JTW 



AA 

_ ■ ~ ~ * , i .._ — Score 
Length . Length 

pnrnr 



NT 
n 



TTUT 



Probability 
|1.2e-102 



Protein name 



Description 



Locus Name 



sp:MRAY_HAEIN



Acc# 
P45062 





■ORF Name 


■ NT 

NTID AAID " 

Length 


AA" 

T ■ — , Score 
Length 


Probability 


B111318_12_2^ . 


1219 |3139. bi> J • | 


1572 ' |756 | 


|6:8e-7b < . . 


Protein name




Locus Name 


■"• Acc# 


probable 




gp:AF141867


AF141867 . 



Description 



Vibrio cholerae probable UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate
ligase (murE) gene, complete cds.






ORF Name 



NTID AAID 



£ 7 b40b2 F5 3b 



1220 



13140 



NT AA 

t ~~Vu t ~ — Score 
Length Length - - -■ 



F 



36 



TuTT" 



Probability 
|2,0e-58 



Protein name 



Locus Name 



sp: YABC_ECOLI 



. ACC# 
P18595 



Description
HYPOTHETICAL 34.9 KD PROTEIN IN FRUR-FTSL INTERGENIC REGION (ORFB)



ORF Name 


NTID AAID 


NT 
Length 


AA n 
. — Score 
Length 


Probability 


Iiibb2_li_i0 


1221 |3141 


73 


222 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


" ORF Name 


NTID AAID 


.NT 
Length 


AA 

, — , , ■ Score 
Length 


Probability 


1189035_c3_48 ' 


1222 • 3142 


,. 179 


540- .. GIG . 

s 


• 4.7e-60 


Protein name 








Locus Name 


- \ ' '' " Acc# 


adenylate kinase


gp:AB024426


AB024426 


Description 




Pseudomonas putida adk gene for adenylate kinase, complete cds.


ORF Name 


NTID . ' AAID 


NT 

J ' • Length 


*. AA ; 

■ — . , Score 
Length 


Probability . 


12b/tf208_t2_15 


.1223 3143 


• 386 


1151 . 1124.4 


| |i. Je-126 ' 


Protein name 








Locus Name 


Acc# 



Description ■ 



sp:DHA3_PSEAE 



Q51344 . 



ORF Name 
|23444507_c3_45 

Protein name 
Description , 
NO-HIT



NTID AAID 



TTZT 



NT AA 
Length Length 
14.52 I 11359 



Score , Probability 



Locus Name 



Acc# 






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

t Score 
Length • 


Probability 


23536bll_i2_16 ... 


1225 




3145 


338 


1017 |210 | 


4.Ie-lS 


Protein name 












Locus Name 


Acc# 














sp:ASG1_ECOLI


P18840 


Description 












(L-ASNASE I)


ORF Name 


NT ID 


AAID 


NT . 
Length 


AA 

„ ; Score 
Length 


Probability 


245682_tl^.8 v 


[1226 


1 


3146 


)303 , 


|912 |689 | 


B.be-fett ■ | 


Protein name 












Locus Name 


Acc# 


■ ■ . - 












sp:TRUA_ECOLI 


P07649- 


Description' 
















I) (PSEUDOURIDINE SYNTHASE I) (URACIL HYDROLYASE) (PSU-I)




ORF Name . 


NT ID 


.AAID 


NT 
Length 


AA 

T — . , Score „ 
Length 


Probability 


34ibV662_c2_4i 


, 1227: 




3147 


203 


612 321 


U.Se-29 


Protein name 












Locus. Name ■ 


Acc# 














sp:TIPB_PSEFL


P52237 


Description - 
















BIOGENESIS PROTEIN TIPB)


ORF Name 


NT ID '. 


AAID ; 


NT 
Length 


AA ' " . ' ' 
„ — . , . Score 
Length 


Probability 


4042131_cl_33 


• 1228' 




3148. 


70 ' 


213 




Protein name 












Locus Name 


Acc# 


Description 
















NO-HIT


ORF Name 


NT ID ' 


AAID 


Length 


AA ' 
„ — , , ■ ■ Score 
Length •• 


Probability 


4112793_cl_37 ■ 


| |1229 


1 


3149 . 


423 | 


|1272. .175 | 


3.1e-10 - | 


Protein name 












Locus Name 1 


Acc#














sp:CCMH_HAEIN


P46458 


Description , ' 
















CYTOCHROME C-TYPE BIOGENESIS PROTEIN CCMH PRECURSOR








ORF Name 



NT ID AAID 



NT 



4484436 c2 40 



TUJT 



AA 

T tzjL ,_, ■ T ~~ , Score Probability' 

Length . Length — - — — 1 

692 ' I 12079 



|i.7e-179 



Protein name 



Description 



Locus Name 



sp:CCMF_PSEFL



Acc# 
P52225 



CYTOCHROME C-TYPE BIOGENESIS PROTEIN CYCK






ORF Name 


. NTID AAID 


NT 
Length 


AA 

„ — , Score . 
Length 


Probability 


5265800_tl_9 


• 1231 3151 


77 


234 274 


8.1e-24 


Protein name 






Locus Name 


Acc#








sp:IF1_BACSU


P20458


Description • 










TRANSLATION INITIATION FACTOR IF-1








ORF Name 

: — : — — ! ' . 


' ' NTID AAID 


NT 
Length 


• AA. „ . . 
, ■— , Score ■ 
Length 


Probability 


^87775_cl_3<!> 


| 1232 3152 


172 


519 . 299 


i.8e-2& | 



Protein name 



Locus Name 



sp:CCMH_ECOLI



Acc# 
P33925 



Description

CYTOCHROME C-TYPE BIOGENESIS PROTEIN CCMH PRECURSOR



ORF Name 



NTID . AAID 



5894082. tl 7 



TTTT 



NT ' • ' AA 
Length Length 
205 



Score 



VS7T 



JUT 



Probability 
|1.5e-27 



Protein name

Description
21.7 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION



Locus Name 



|sp:YHHE_ECOLI 



Acc# 
P10120 



ORF Name . 



NTID. AAID . 



1058462 c3 105 



3T5¥~ 



NT AA 
Length Length 




Score ' Probability 



|77 | [0.032 



Protein name ' 



Locus Name 



15 kDa vesicular-like antigen



gp:PFAVLAP



Acc# 
M94732



Description 

Plasmodium falciparum 15 kDa vesicular-like antigen gene, exons 1 through 4.






ORF Name ■ ' 


NT ID ' 


AAID *' 


KTT> 
N 1 

Length 


7\ 7\ 

AA n , 
T — . ; Score 
Length 


Probability 


|13.688802_C2^101 J 


1235 


1 l 31bb " 


IT 6 


■ 231 






Protein name 








Locus Name 




;' ACC# 


Description .' 


t ■ 












NO-HIT


ORF Name 


NT I D 


AAID • 


NT 

Length' 


AA 

. : — \ , Score 
Length ■ 


Probability 


14$4.45$$_fc2v28 




" 3156 


391 


1176 465 


4.7e 


-44 | 


Protein name 








Locus Name 




Acc# . 


36 kDa protein 








gp:HPU86610




U86610 


Description .' • , 




Helicobacter pylori 36 kDa protein gene, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


' AA 

- — : - , Score 
Length. 


• Probability" ■ 


|16132787_t3_50 - 


1257 




| 105 


318 . |157 


|l/2e- 





Protein name 



Locus Name 



sp:YDCQ_ECOLI



-. Acc# 
P76107 ' 



Description

HYPOTHETICAL 16.1 KD PROTEIN IN TEHB-ANSP INTERGENIC REGION



ORF Name 


NTID 


AAID .' < 


■ -.NT '' 
Length 


AA 
Length; 


Score -> 


Probability 


19532661__t2_33 


1238, 




3158 • ^ 


77 


234 






Protein name 

.... . ,., , 










Locus 


Name 


Acc# .... 


Description 
















NO-HIT


ORF Name ■ 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 














20335427^c2_78 


1239- 




3155 . 


657 


; 19.74 


235-. - 


2.0e-17 



Protein name 



Locus Name 



minor tail protein gp26-related protein



pir:F75605 



. Acc# 
F75605 



Description 






ORF Name 


' NTID 


AAID 


NT 
Length 1 


AA . 
T — , , Score 
Length 


Probability 




; | |1240 




. 69 : 


pio 






Protein name 








Locus Name 




Acc# 


Description 














NO-HIT




ORF Name 


- NTID 


AAID 


NT 

Length 


1 AA 

, ■ — . , . Score 
Length 


Probability 


21909377_t2_29 


"■ 14241. . 




. 401 


|1206 261 


. 3.0e 


-21 | 


Protein name 








. Locus. Name ■ 




Acc# 


hypothetical pro 


tein jhpl3* 


50 




pir:G71815


G71815 



Description 



ORF Name 



NTID 



AAID 



22266577 12 21 



NT . • . AA 
Length Length 
220 



Score 



I3TT 



Probability 



Protein name



Locus Name 



thiamine-phosphate pyrophosphorylase



gp:AF180145



Acc# 
AF180145



Description 



Zymomonas mobilis GTP-binding protein CgpA (cgpA), 60KD inner-membrane
protein yidC (yidC), hypothetical protein, glutamine-pyruvate aminotransferase
gltB (gltB), glutamate synthase small subunit gltS (gltS), undecaprenol kinase
udk (udk), hypothetical protein, NADH-dehydrogenase, hypothetical
protein, zml26rf 5, hypothetical protein, aspartate aminotransferase
A, beta-hydroxysteroid dehydrogenase, phosphomannomutase pmm



ORF. Name 



NTID AAID 



NT " ■ AA 
t ' T ~~Vu Score 
Length Length — 



22894061 ci .119 



Probability 
|2.6e-18. 



Protein name 



Description 



Locus Name , 



sp:TOLR_PSEAE



Acc#
P50599



TOLR PROTEIN



ORF Name . NT ID AAID ' 


VTT 1 

TiRncrt - h 


AA 

T L(^v\cif~ }~i — : — 


. Probability 


23444426_t3_45 \ | 1244 ' . 3154 


419 


|1260 " 542 


. 8.3e-94 . 


Protein name 




Locus Name 


■ Acc# 


ATP-dependent helicase HrpA homolog




gp:D90779 




Description'' '' 






D90779:D90761:AB001340


E.coli genomic DNA, Kohara clone #268 (31.6-32.0 min.)




' ORF Name ' - . NT ID AAID 


NT 

' Length 


AA . o 
, • — ■ Score 1 
Length 


Probability 


|24021016_t3_43 • | 1245 | |3165 | 


( |1195 


| 3588 p^5| 


|1.6e-266 . ] 



Protein name 



Description 



Locus Name 



sp:MFD_HAEIN



Acc#
P45128



TRANSCRIPTION-REPAIR COUPLING FACTOR (TRCF)


ORF Name " NT ID AAID . . ■ 


NT 
Length 


AA 

— Score 
Length • 


Probability ; 


24401887_cl_62 | |1246 - 3166. 


114. 

■ 


345 . | 165 




-12 


Protein name v ■ . 1 




Locus Name . 




Acc# 


■ ■ i * ■ . . ' 




|gp:AB030825 




AB030825 


Description 










Pseudomonas aeruginosa genomic DNA, partial sequence, strain PAO1.




ORF Name NTID AAID- : 


■NT,, ,. 
Length 


AA ' 
— L , Score 
Length 


Probability ■•■ 












pb554561J:2_20 - | 1247. 3167. ' 


151- . 


456 |94 J 


[0.0015 


Protein name 




Locus Name 




Acc# 


hypothetical protein PH1Q0I 




pir:D71092




D71092 


Description










ORF Name . .NTID AAID 


NT 

Length 


AA 

t i • Score 
Length 


Probability '• , 


29?4032_c2_82 1248 ; | 3168 , 


264 


795 |307 | 


|9.6e 


-3,5 


Protein name 




' Locus Name 




Acc# . 


minor tail protein gp19




pir:T13105




T13105



Description






ORF. Name 


NTID 


AAID 


' ' NT 
Length 


AA 

, .^-r. , Score 
Length 


Probability 




319i6632_tlJ4 ! 


■ 1249 




62 


; 189 






Protein name 








Locus Name 


Acc# 




Description 














NO-HIT 




' ■ ■ ■ . . . 










ORF Name . 


NTID 


AAID 


NT . 
Length 


AA 

T — . , Score 
Length 


Probability 




3320327_c2_76 


1250 


3170 


94 


285 






Protein name 








Locus Name 


Acc# 




Description 














NO-HIT


ORF Name 


NTID 


AAID 


' NT 
Length 


AA 

^ — . , Score . 
Length 


* Probability 




34188892_cl_61 


|1251 


3171 


673 1 


. 2022 |227 | 


|4.5e-15 ' 




Protein name 








Locus Name 


Acc# 





. Description 



sp:VG26_BPMD2



O64220



MINOR TAIL PROTEIN 


NT 

. ORF Name " NTID AAID — , 

■ ■ • Length - 


AA , ' 
T — . Score 
Length 


Probability , : '■ 


34415711_tlJ-0 1252 3172 368 


1107 ; |288 | 


|2.7e-2b 


Protein name 


Locus Name 


Acc# < 


conserved hypothetical integral membrane
protein HP1486


pir:F64705


F64705


; - ■ "•. i.. 




Description 






ORF Name. NTID AAID J — . , 

Length 


AA 

T . — . ■ Score 
Length 


Probability 


35367058_t3_>l 1253 > |3173 76 


231 151 | 


8.8e-ll . 


Protein name \ ^ '. 


Locus Name . 


v . Acc#" 



sp:YDCQ_ECOLI



P76107



Description 

HYPOTHETICAL 16.1 KD PROTEIN IN TEHB-ANSP INTERGENIC REGION






ORF Name 



NTID 



AAID 



AA 

— — Score 

, Length Length . -' - ■ " 



35942905 ET 19 , 



TT7¥" 



NT 



TUT 



Probability 
3 . le-24 ^ ' 



Protein name

Description
HYPOTHETICAL TRNA/RRNA METHYLTRANSFERASE YIBK



Locus Name 



sp:YIBK_ECOLI



Acc# 
P33899 



ORF Name ; 


NTID 


AAID 


NT 
Length . 


AA 

' — . Score 
Length 


. Probability 


3511875^2.104 . 


,,pbb 


1 pi'! 5 ■;, 


77 | 


P A 1 




Protein name 








. Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 

, Score . 
Length 


Probability 


3532S955_t2_23. 


|12b6 




| 10S 


. 327 




Protein name 








Locus Name 


Acc# 1 


Description 












NO-HIT


, ORF Name 


NTID 


. AAID 


NT ■ 
Length 


AA 

■ — ; Score 


Probability 








Length 




3944450_c2 93 . 




|3177 


232 • 


599 . 422 :| 


1.7e-39 



Protein name 



Locus Name. 



TolQ protein



gp : PPPALl 



Acc# 
X74218 



Description 








Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.


' ORF Name NTID' 


AAID 


.... NT. AA 
Length Length 


Score Probability 


4460387 tl_9 , . | 12SS 


| 3178 


514 \ | 1545 ' 


|240 | p-.4e.-iV 



Protein name 



Locus Name 



hypothetical protein jhp1382



pir:A71816



Acc# . 
A71816 



Description 






ORF Name 



NTID 



AAID 



4507705 c2 105 



JTTT 



NT 
n 



AA' 

t Score 
Length Length ■■ — — 



UTT 



Protein name 



Description 



Locus Name 



sp:Y014_BPHP1



Probability 
|2.0e-lr — ' 

Acc#
P51716 



HYPOTHETICAL 14.9 KD PROTEIN IN REP-HOL INTERGENIC REGION (ORF14)



ORF Name 



NTID 



AAID 



14728415 c3 120 



1250 



][ 



TTF0" 



NT 
n 

T7F 



AA 

— Score 
Length Length -~ — - 

rnrr- 



Probability 
10.030 



Protein name 



Locus Name 



ras interacting protein RIPA 



gp:AF159241



Acc# 
AF159241 



Description 



Dictyostelium discoideum ras interacting protein RIPA (ripA) mRNA, complete
cds.



ORF Name 



NTID 



AAID 



4730050 .cl 73. 



TZZT 



;:nt AA 
Length ' Length 
439 I ' 11320' 



Score 



Probability \ s 
1.6e-34 



Protein name 



Locus ' Name 



TolB 



gp:HIU32470



Acc# ■•• 
U32470 



Description 



Haemophilus influenzae tolQRAB gene cluster, inner membrane protein (tolQ)
gene, partial cds, inner membrane protein (tolR), outer membrane integrity
protein (tolA) and colicin tolerance protein (tolB) genes, complete cds.



ORF Name 



NTID AAID 



5282805 cl 63 



TTZT 



NT 
\Z2T 



AA ■ . . 
t ~ 4— i— . Score 
Length Length — 



Probability 



BP 



TIT 



i.2e-29 



Protein name 



Locus Name 



minor tail protein L homolog i protein gpl8 



pir:T13104 



Acc# 
T13104



Description 



ORF Name 



/NTID AAID 



5:448393 c2 83 



NT AA 
Length Length 

75 ; 



Score Probability 



Protein name 
Description 
NO-HIT



Locus; Name 



Acc# 






ORF Name ? 


NTID 


AAID. 


Length- 


AA ' 
T L , Score 
, Length 


Probability 


682 7 77 c2 7 9 


1264 


3184 


13 9 


420 




Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
... Length 


AA 

T — . , Score 
- Length 


Probability 


7265950_ri_7 


| 1255 




1014 


3045 |726 | 


p.le-134 | 


Protein name 








Locus Name 


, . Acc# 



Description 



sp:HRPA_ECOLI



P43329:P77479:P76861:P76863



ATP-DEPENDENT HELICASE HRPA
















ORF Name 1 


NTID ' AAID 


.,' NT 
Length 


AA 
Length. . 


Score 


Probability 


24113927_t2__l 


1266. ... 3185 


334 | . 


■ 1005" 


116 


0.001-2 — 


Protein name 


■ • . . h 






Locus Name




Acc# 


STARP antigen 








gp:PRSTARPA


Z30339 . 


Description 












P.reichenowi STARP gene for STARP antigen.














ORF Name 


NTID AAID 


NT ' ., 
Length 


AA 
Length 


Score 


Probability . 


25674906_1I__I 


1267 3187 


206 . 


621. 


43 5 


7.0e- 


-41 


Protein name 








.' Locus Name 




Acc# . ■"- 



sp:YVCF_BACSU



P37478 



Description 
INTERGENIC REGION



ORF Name 



NTID " AAID: 



29527207 t'3 4 



NT ., 
Length 
1230 I 



AA ; 
Length 

1^3 



Score Probability 
[88 | |2.4e-05 ~~ 



Protein name



Locus Name



probable two-component sensor protein



pir:C70624



' Acc# 
C70624 



Description 



ORF Name 



NT ID 



AAID 



35210305 tl 2 



1269 



NT AA 
Length -, Length 



Score 



Probability 
l2.3e-0^ 



Protein name 



Locus Name 



SmeS 



gp:AF173226



Acc# 
AF173226 



Description 



Stenotrophomonas maltophilia multidrug efflux system SmeR (smeR), SmeS
(smeS), SmeA (smeA), SmeB (smeB), and SmeC (smeC) genes, complete cds.



ORF Name 



NTID. AAID 



12938586 c3 89 



TT7TT 



■ NT AA 
Length Length 
T52 



Score 



2W 



Probability . 
3.8e-26 



Protein name

Description

PEPTIDOGLYCAN-ASSOCIATED LIPOPROTEIN PRECURSOR



Locus Name 



sp:PAL_PSEPU



Acc# 
P43036 



ORF Name 


NTID 


AAID 


NT , 
Length 


AA Score 
Length 


Probability 


14237656_cl_39 - 


.-. 1271 




3191 


85 


258 




Protein name ./. 










Locus Name 


. Acc# ' 


Description • 


1 








■ ■ _ 'i 




NO-HIT


ORF Name' \ 


NTID 


AAID 


V NT 
Length 


AA . 
• v — . , Score 
.Length 


> . Probability 

• -' ■' " ■ . ■ *' * f 


14492157_t3_31 


1272 


1 


3192 


276. • 


831 




Protein name 










. ^Locus, Name 


"' " Acc# 


Description .. 










i' 




NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

; — . , . Score.. ; 


Probability 










Length 




H4875327_t2_18 


| |1273 




3193 


|. <*■/-. , 


2604 |1592 | 


|1.7e-i63 


Protein name 










* . Locus Name 


. Acc#"' 


membrane alanyl aminopeptidase




gp:AF157493


AF157493. 



Description 

Zymomonas mobilis ZM4 fosmid clone 42D7, complete sequence.



ORF Name 



NTID AAID 



12 74 



NT AA 
Length Length 
199 



Score 



Protein name 



Locus Name 



NrpO 



gp:PMU46488 



Probability 
|2.Se-0S ; 

Acc# 
U46488 ; 



Description 



Proteus mirabilis NrpS (nrpS) gene, partial cds, NrpU (nrpU), NrpT (nrpT),
NrpA (nrpA), NrpB (nrpB), NrpG (nrpG) and IrpP (irpP) genes, complete cds.



ORF Name 



NTID . AAID 



16180387 t3 36. 



TUT 



NT 
Length 
354 • 



" AA ' o - '• 
— - , . ' Score 
Length — : — — ' 



ITT 



Probability 
3^e-07, • 



Protein name 



Locus Name 



hypothetical protein RP367 



pir:H71693 



Acc# 
H71693



Description 
ORF Name' 



NTID AAID 



16507576- c2 64 



13195 



NT" 
Length 
TTS 



AA ^ 
t • Score 
Length 



TuTT 



¥F5~ 



Probability 
5 : .le-42 



Protein name 



Description 



Locus Name 



sp:SMTA_ECOLI



Acc# 

P36566:P77586



SMTA PROTEIN


ORF Name "V 


- NT ID 


AAID 


NT _■ 
Length 


AA . . 
T , Score 
Length 


Probability 


22li632<?_t2_14 \ 


1277 ; 


3197 


: 109 


■ 330 205 | 


|1.7e-16 ■ - 


Protein name 








Locus Name 


. ' Acc# 



Description 



sp:PA1_KLEPN



P37446



ACYLHYDROLASE) (OUTER MEMBRANE PHOSPHOLIPASE A) (OM PLA)




ORF Name 


NT 'AA 
NTID AAID • — '■' I- — , Score 
•■- Length Length 


Probability 


22694056_±l_ii ; ; 


| 1278 3198 |637 , 1914 • [1865 | 


|2.1e-192 


Protein name 


Locus Name 


Acc# . 



Description 
CLPB PROTEIN 



sp:CLPB_HAEIN



P44403 






ORF Name ■ 


NT ID 


AAID 


NT 
Length ' 


AA 

— ' Score. 
Length: 


Probability 


23495633_£2_lb 


1279 


3199 


. 349 


. 1050 


105 


J0.0072 


Protein name 








Locus Name 


Acc# 


ComB 


gp:AF027189


; AF027189 


Description

• • 














Acinetobacter sp. BD413 lytB, comB, comC, comE, and comF genes, complete cds;
and unknown genes.




ORF Name > 


NTID 


AAID 


NT 
Length 


AA 

T — • Score 


Probability 














2376890_c2_b<5> 


1280 


3200 


90 ' 


273. - 






Protein name'' , ' 








Locus Name 


Acc# 


Description 














NO-HIT 




ORF Name 


NTID 


AAID 


NT 
Length 


AA 

T '-,i Score 
Length 


Probability . 


24355540 £3 38 


12.81 


3,201 


282 ; 


849 


291 


1.3e-2b | 


Protein name .* 








Locus Name 


Acc# 


ABC transporter potG , . 


pir:B71694 


: B71694 


Description 














ORF Name 


NTID 


AAID


. NT 
Length 


AA 

T — . ; Score 
Length 


Probability 


24643831_li_3 


1282 


3202 


34;6 


1041 


213 


3.2e-l6 


Protein name 








; Locus Name 


Acc# 


phospholipase A








gp:CCPLDA


Y11031 . 


Description 














C.coli pldA gene.




ORF: Name • 

■ ■ - ■ . ••( 


NTID 


AAID 


J NT 
Length 


AA 

r — . , Score 
Length . • ; • - - 


Probability 




^!4783453_t2_2b 


|1283 


1 13203 


P 4 I 


JbVb . • 


73B 





Protein name 

Description 

PROTEIN 



Locus Name 



sp:CLPB_BACNO



Acc# . 
P17422 






ORF Name 



NT ID 



AAID 



25817157 F3 34 



3204 



NT- AA 
Length Length • Probability 

2b0 



T5T 



ITT' 



|i.4e-28 



Protein name 



Locus Name 



hypothetical protein 



gp:AHWAAA179



Acc# 
Z96927 



Description 



Acinetobacter haemolyticus waaA gene, strain ATCC 17906.



ORF Name 



Protein name 



NTID 



11285 



AAID 



T2THT 



ct391 hypothetical protein 



Description . 



NT AA 
Length Length 
342 I 11029 



Score Probability 
|198 | |1.4e-l ' 3 



Locus Name 



pir:G72072 



Acc#
G72072 



ORF Name 



NT AA 

NTID AAID. • — n Score Probability 
— — — - Length . Length — : — — — — • — - J - 



32703125 12 .22 



Protein name 



hypothetical protein RP368 



Description 



TOT" 



"9175" 



T£~9~ 



Locus Name 



pir:A71694



S.9e-34 



Acc# 
A71694 



ORF Name



NTID 



AAID 



3^3&8941 tl 4 



TT5T 



Protein name 



competence protein ComF 



Description 
Pseudomonas stutzeri JM300



NT AA 
Length Length 
285 . 



Score Probability 



T5T 



Locus Name 



gp r£5T2T57T2 



4 . be-09 



Acc#
AJ249742 



bioB (partial), comF and dpi. (partial) genes.



ORF Name 



NTID 



AAID 



35943885 c2 GG 



T2W 



T2W 



NT AA 
Length Length 
413 I 11242 



Score Probability 



Protein name 



Locus Name



Acc# 



Description 



NO-HIT



ORF Name . 


NTID 


AAID 


NT 
Length 


AA • n 
„ — , , Score 
Length 


Probability 


4111633_t2_13 


1289 


: | 3209 . 


154 


455 




Protein name 








, Locus Name 


Acc# 


Description 










• 


NO-HIT












ORF Name 


, NTID 


AAID 


JN 1 

Length 


Score 

•Length 


Probability • 




j (1290 


3210 


245,. 


741 [777 1 


|4.0e-77 | 


Protein name 








Locus Name


- Acc# 1 .*■ 










sp:RNPH_PSEAE


P50597 


Description- 












NUCLEOTIDYLTRANSFERASE)


ORF Name- 


• NTID 


AAID 


' NT 
Length 


' AA 

: „ — . v- Score- 
Length 


Probability 


4572203_ti_9 


1291 


3211 


329 


990 1117 1 


|5.7e-6y 


Protein name 








Locus Name 


• Acc# 


merozoite surface antigen 2 






U91655 . . 


Description . ." '.- 


Plasmodium falciparum isolate V310 merozoite surface antigen-2 (MSP-2) gene, partial cds.












ORF Name ■", 


NTID 


AAID . 


1 'NT 
Length 


' AA ,"" ■ . 
T Score 


Probability 








Length. — — • 




|47972U2_c2_74 


. l|1292 


3212 


r 


| 204 | ; 




Protein. name 








Locus Name ' 


- Acc#' 


Description 












NO-HIT


ORF "Name 


, '., NTID 


AAID 


NT 
Length 


AA 

■— . , Score 
Length 


Probability 


|5082637_r3_33 


1293 


1 I 3213 


440 


1323 635, 


4-.'J?e-62 . 


Protein name 








Locus Name


Acc# 


WaaA 








gp:AF026386


AF026386.; 


Description 












Salmonella typhimurium strain LT2 LPS core oligosaccharide biosynthesis
region, WaaY (waaY) gene, partial cds; WaaJ (waaJ), WaaI (waaI), WaaB (waaB),
WaaP (waaP), WaaG (waaG), and WaaQ (waaQ) genes, complete cds; and WaaA (waaA)
gene, partial cds.














ORF Name 


. NTID 


AAID 


"KT r P 

Jn 1 • 

Length 


t Score 
Length; > . - 


Probability' 


5132 8 3 8_11_5 


12 94 


3214 


255 ' " 


76 8 




Protein name 








Locus Name • 


- Acc# 


Description 












NO-HIT


ORF Name 


f NTID " 


AAID 


NT 
Length 


T — . Score 
Length 


Probability 


5867075_t3_2i), 


1255 


32i5 • 


-.202 


6,09 |i05 | 


[0.00049. 



Protein name 



Locus Name 



pilV protein 



pir:S77594



Description 



ORF . Name 



NTID. 



AAID 



790700 -c2 65 5 



NT . AA 

Length Length 
1379 — I [TT3T7 



— ' : * • Score 



Protein name 



Locus Name 



hypothetical protein' TP0565 



pir :C71308 



Description 
ORF Name . 



NTID AAID 



9775283 ci 46 



TFFT 



NT , AA 

Length Length ' 
499 



Score



Protein name 



Locus Name 



probable alginate O-acetylation protein (algI)



pir:D71308



Description 
ORF' Name 



T 



NTID 



AAID 



115627 cl 7 



T2W 



NT AA 
Length Length 
330 | 1993 



Score 



[S7T 



Protein name 



Locus Name 



sp:GLMU_HAEIN



Description 

ACETYLGLUCOSAMINE-1-PHOSPHATE URIDYLTRANSFERASE)



Acc# 
S77594 



Probability 
9.2e-08 • — " 



Acc# ' 
C71308 



Probability 
II . 8e-44 " 



' Acc# 
D71308 



Probability 
|l:0e-87 : 

Acc# 
P43889 
ORF Name

NTID AAID

NT
Length

AA
Length

Score

Probability



i6828178_ti_2 


1299 | 3219 


616 


1851 |2166 | 


|2 . 6e 


-224 


Protein name 






Locus Name 




Acc# 








sp:TYPA_HAEIN




P44910


Description 












GTP-BINDING PROTEIN TYPA/BIPA HOMOLOG


ORF Name 


NTID AAID 


NT , 
Length 


."■ AA r, ■ 

„■ — . , Score 
Length 


Probability 


|32656465_ci>_i0 


1300 ■ 3220 


78 . 


237 






Protein name 






Locus Name 




Acc# 


Description 












NO-HIT


ORF Name ■ 


• NTID. , AAID . 


NT 
Length 


AA n 
■-— . : Score 
Length 


Probability 


3336053_ti_i 


1301 . 3221 


m | 


402 [618 1 


|2.?e 


-60 


Protein name' 






Locus Name 




Acc# 


outer membrane protein CD precursor 




pir:S39866 /• 




. S39866 • 


Description 












ORF Name 


NTID AAID 


■ •': NT 

Length 


AA ■ ' 
• •' " , , ' - Score 
Length 


Probability 


|iu97583i_c3_12 ■ ■ 


| 1302 | 3222 


92 


■ 279 ■ 






Protein name 






. Locus Name 




' •' Acc# 


Description 










* 


NO-HIT 












ORF ■ Name 


NTID AAID 


NT 
Length 


AA ;•• ■ 

— Score 
Length 


Probability 


15912757_cl_8 


| (13 03- |3223 


123 


1 372 | |86 | 


0.048. 


Protein name 






Locus, Name 




Acc# 


FIP2 






|gp:AF061034 




AF061034


Description 












Homo sapiens FIP2 alternatively translated mRNA, complete cds










ORF Name 



NTID 



AAID 



22457187 FT 5 



11304 



■ NT AA 
Length - Length 
1250 v 



Score ; Probability 



7v0e-8 9 



Protein name .-.*"' 

Description 
HYPOTHETICAL PROTEIN HI0882



Locus Name 



sp:Y882_HAEIN



Acc# 
P44068 



ORF Name 



135271883 11 4 



Protein name 
Description 
NO-HIT



NTID AAID 



NT AA 
Length Length 
[60 | |183 1 



Score 



Locus Name 



Probability 



Acc# 



ORF Name 



22848457 12 3 



Protein name 
Description / 
NO-HIT



NTID AAID 



TTOT" 



NT AA , 

Length Length 

— 



Score * Probability 



Locus Name 



Acc# 



ORF Name 



22853,375 13 5 



Protein* name 



Description 



NTID AAID 



TTFF 



NT AA 
Length Length 
239 I I7^T 



— * ■ Score 



Locus ;Name. 



Probability 



Acc#



NO-HIT



ORF Name 



|2929S968_c3_9 



■ Protein name 
Description 
NO-HIT



NTID 



AAID 



, NT AA 
^— — - Score 
Length Length — — ; 



3228 



Locus . Name 



Probability 



Acc# 






ORF Name 



NTID , AAID 



976678 ci 10 



11209 



TZTT 



NT 
n 
T5U 



AA 

— , Score 
Length Length - — - — - 



Probability 
7.3e-23 ~- 



Protein name 



'■ Locus Name 



sp:PRTR_PSEAE



Acc# 
Q06553 



Description . 

TRANSCRIPTION REGULATORY PROTEIN PRTR (PYOCIN REPRESSOR PROTEIN)



ORF Name 


, NTID 


AAID 


NT 
Length 


AA 

— , • Score 
Length 


. Probability 


989077_11_1 ■ 


1 1 1310 


13230 


122 


p* 9 r 




Protein name 








Locus Name 


. Acc# 


Description 












NO-HIT


' ORF Name 


NTID 


AAID 


NT 
Length 


AA . 

— ■ . Score 
* Length 


Probability 


i062575_;c3_34 


. ; ' |i31i • 


3231 


| 10$ - | 


330 




Protein name 








Locus Name - 

Trr : ^ 


- Acc# 


Description 


v. 










NO-HIT


ORF Name 


NTID 


AAID 


NT 
• Length 


' AA- 

T — . „ Score 
Length 


Probability 


|111252a0_13_13 


j 1312 


3232 


146 | 


441 b!38 


. 8: beV52.. 


Protein name 








Locus Name 


, . Acc# . 


nifU protein homolog HI0377




pir :C64064 


C64064 



Description 



ORF Name 



11297092 cl 20 



Protein name 



NTID 



AAID 



TTTT 



TZTT 



NT 
TTT 



AA 

, — ■ ' . Score i 

Length Length • — : — — 



Probability 
174 | |3.9e-12 ~ 



Locus Name 



probable gamma-glutamyl transpeptidase
precursor



pir :E70682 



Acc#/ 
E70682



Description 






ORF Name 



NTID AAID 



11314 



15224" 



NT AA 
Length Length 
110 



Score 



TTT 



TTT 



Protein name 



Locus Name 



probable gamma-glutamyl transpeptidase



pir :T34901 



Probability 
|2.4e-17 ! 

Acc# 
T34901 



Description 
ORF Name 



NTID AAID 



NT . AA 
t " — . -i y Score 
Length Length - 



20727194 c2 24 



FT 



T77T 



1TF" 



Probability 
1.8e-23 



Protein name 



Description 



Locus Name 



AF017750 



Acc# 
AF017750 



Haemophilus ducreyi cytochrome C-type biogenesis protein (ccmH),
recombinational DNA repair protein (recR), manganese superoxide dismutase
(sodA), and CitG protein homolog (citG) genes, complete cds.



ORF Name NTID 


AAID 


NT 
Length 


AA 

, ; Score 
Length ■ 


Probability 


21679025_jtl_l 1315.. 


3236 


420 


1263 - 11516 1 


|2.0e-155 


Protein name 






Locus Name . 


Acc# . 








sp:NIFS_ECOLI


- . ,.i 
P39171:P76


Description . ; 






-• .. . •• 


581:P76992 


nifS protein homolog


ORF Name ' NTID 


AAID [ ' 


NT ; 
Length 


AA 

. — >■ , . Score ■• 
Length. 


Probability , .. 


|317971=>2J:3ji4 ■ 1317 


3237 


18.6 ■ 


561 |224 


1.6e-18 


Protein name 






Locus Name 


Acc# 








sp:HSCB_ECOLI


P36540 


Description 










CHAPERONE PROTEIN HSCB (HSC20)


ORF Name NTID 


AAID 


nt 

Length 


AA ' . 
— Score 
Length 


Probability 


|32244203_c2_26 1318 


| 323B. 


72 


\ 219 - |12B . 


b.0e-08 - 


Protein name 






Locus Name 


Acc#



E 



p:VCH2ill22 



AJ231122 



Description 
Vibrio cholerae z61t gene. 






ORF Name 



I3339U297 F2 b 



Protein name 



Description 



NTID 



AAID 



TJTT 



TZTT 



NT AA 
Length Length 
178 



Score 



Probability 
9.6e-37 — — 



Locus .Name 



sp:YEHP_HAEIN



Acc# 
P44675



HYPOTHETICAL PROTEIN HI0379


ORF Name NTID . AAID 


NT 
Length 


— Score 
Length - 


Probability . 


|3<512967tt_±l_2 |1320 J 3240. 


| 112 


339 | |394 | 


1.8er3y 


Protein. name- 




Locus Name. 


: Acc# 






sp:YFHF_HAEIN


P44672 


Description 








HYPOTHETICAL PROTEIN HI0376


ORF. Name NTID AAID 


NT 
Length' 


■ AA n ' 
— , Score 
Length 


' Probability ' 


362203B2_c2_25 | |1321 | 3241 


120 


363 |174 | 


|3.2e-12 



Protein name 



Description 



Locus Name 



|sp:GGT_PIG 



Acc# 
P20735 



GLUTAMYLTRANSFERASE) (GGT)


ORF Name ' NTID 


AAID. 


NT AA - . ' 
— j — : , ■ Score 
Length Length 


Probability-; : 


433193S_t2_9 |1322 


3242 


^>22 1569 1435 


. 7.6e-147 



Protein name 



Description 



Locus Name 



sp:HSCA_HAEIN



■ Acc# • 
P44669



CHAPERONE PROTEIN HSCA (HSC66)










ORF Name, 


NTID AAID 


NT . 
Length 


AA 
Length 


Score 


Probability . 


4332S39_t2 1 


10 ■ |1323 3243 


115 . 


- 348 • 


P" 1 


|7.5e-37 



Protein name 



Locus Name 



ferredoxin



gp:AF096864



Acc# : 
AF096864 



Description 



Pseudomonas aeruginosa heat shock protein (hscB), heat shock protein 66-kDa
(hscA), ferredoxin (fdx), and nucleoside diphosphate kinase (ndk) genes,
complete cds.
ORF Name


NTID 


AAID 


NT, 
Length 


AA 

T — .v- Score 
Length • 


Probability 


5898553_c2__:>8 






I 113 1 


P 60 1 . 




Protein name 








Locus Name 


Acc# 


Description 




.. 








NO-HIT


v. 










ORF Name 


- NTID 


AAID 


NT 
Length 


AA . 
— Score 
Length 


Probability 


|7070215_c2_27' 


- 1 1325 . 


| 3245 


|16i 


|486 | |35i 


|S.$e-32 



Protein name : 



Locus Name 



putative gamma-glutamyltranspeptidase 
precursor



gp:PST249741



Acc# 
AJ249741 



Description 



Pseudomonas stutzeri JM300 gacS (partial) and ggtB (partial) genes.


• ORF Name, 


NTID 


AAID 


NT 
Length 


. AA 
Length 


Score 


Probability 


I291580tt_i3_10 


1326 ■ 


3246 


200 - | 


603 






Protein name 








Locus Name . 


Acc# • 


Description 














NO-HIT


ORF Name ' . .. 


NTID 


AAID 


- - ' ' NT : ' 

. Length' 


AA 
Length 


>; Score ' 


Probability 


|20737503_£3_8 . 


1327; 


| ^247 


1 m , 


1116 


• 41S , 


|4.5e-35 



Protein name 



Locus Name 



probable permease perM homolog (perM) RP630



pir:E71668



' ACC# 
E71668



Description 
ORF Name " 



NTID 



, AAID 



22000293 c2 TT 



TTZW 



Tim 



NT AA 
Length Length' 
97 



Score 



Probability 
1.2e-31 v 



Protein name 



Locus Name, 



50S ribosomal protein homolog 



gp:AF153712



: acc# • 

AF153712 



Description 



Pseudomonas sp. BG33R strain BG33R 50S ribosomal protein homolog gene,
complete cds.






' NT 
ORF Name NTID AAID ' . - — 
— ^— . . Length 


AA 

, — ; Score 
Length 


Probability 


23863307_t3_-9 1329 3249 261 


786 |194 | 


|2 .4e 


-lb 


Protein name, • 


Locus Name 




Acc# 




sp:YFGE_HAEIN




O86235


Description ; * 








HYPOTHETICAL PROTEIN HI1225.1


( NT 

ORF .Name , t ■ NTID AAID — j _ i . 

- Length 


AA : . 
, — . ; Score 
Length 


Probability 


243 08 561_C3_17 1330 . 3250 - ' 182 j 


549 ' | |710 | 




-70' 


Protein name 


Locus Name 




' Acc# > 


phosphoribosylformylglycinamidine cyclo-ligase : 5'-phosphoribosyl-5-aminoimidazole synthetase


pir:AJECPC


A25955:B65026


Description s : 








" ' ' NT 

ORF Name NTID AAID — . 

:'. Length 


■AA 

— , Score 
Length 


Probability 


|2S2772SI_c3_lS _ ■ | 1331,. | 3251 ( 131 


|396 352 | 


4 . 4e- 


-32 ■ • 


Protein name ■ ■ ■ . 


Locus Name. 




, Acc# 




sp:PUR5_ECOLI




P08178 


Description ■■. ., 








(PHOSPHORIBOSYL-AMINOIMIDAZOLE SYNTHETASE) (AIR SYNTHASE) 


ORF kame ' ■ . NTID , AAID : , . ■ — . , ' 
. -; , Length 


AA • : 
. — . _ Score . 
Length 


Probability 


6142915 c2_14 | 1332 | , 3252 228 . 


687 | |382 | 


|2 . 9 e 


-35 


Protein name ... ' 


Locus Name 




Acc# , 


5'-phosphoribosylglycinamide transformylase


gp:STU68765




U68765 


Description 








Salmonella typhimurium 5'-phosphoribosylglycinamide transformylase (purN) and 5'-phosphoribosyl-5-aminoimidazole synthetase (purI) genes, complete cds


. ■ NT 
. ORF Name . • NTID . AAID — , v- 
••• . Length 


AA . < n • 
— . Score ! 
Length 


•Probability . 


10744000^c3_102 1333 3253 309 


|930 (1094 | 


ru Oe- 


-11.0 : 


Protein name 


Locus Name 




Acc# 


probable Mn transport protein 

-i 


pir :G64063 






Description " ■ 






G64063 :C41 
833 






ORF Name. 


NTID AAID 


NT 
Length' 


AA 

, . — ^ Score 
Length 


Probability 


1181631_11_2. . 


1334 -3254 


|5S8 


1677 |1333 | 


|4.9e-136 


Protein name 


■■ 




Locus Name 


Acc# 








sp:60IM_PSEPU


P25754


Description 






•• f . 




60 KD INNER-MEMBRANE PROTEIN




ORF Name


NTID AAID 


NT 
Length 


AA . ; 

— , Score 


Probability 






Length 




13703378_c3_117 


| |1335 | 3255 




i |288 163 | 


|4.7e-12 


Protein name 






Locus. Name 


' "■■ Acc# 








sp:YEAQ_ECOLI


P76246


Description 










HYPOTHETICAL 8.7 KD PROTEIN IN GAPA-RND INTERGENIC REGION






ORF Name 


■ NTID AAID 


NT 

Length 


AA 

— , Score 
Length 


Probability 


15Q31513_t3_43 


1336 32b6 


479 


1440 |1390| 


|4.be-142 , 


i . 
Protein name 






' ' Locus Name 


Acc# . ■ * . 








sp:THRC_METGL


P37145


Description 










THREONINE SYNTHASE




ORF Name 


NTID AAID 


NT 
. Length 


■ AA 

- _ Score. 
Length 


Probability 

■ *■ ■ " ■ • "t ■ 


i503?677_ci_64 / 


| 1337 |3257 




| ' 801, . . . 196 


i.be-15 


Protein name 






■ , i" * ■ 
Locus Name 


Acc#; 








gp:DNINTREG


X98546


Description 










D. nodosus intB, regA, gepA, gepB, and gepG genes.






ORF Name 


NTID ' - AAID 


NT 

Length 


AA 

— Score 
Length 


Probability 


15665903^cl_59 ■ 


" . | 1338 3258 


281 


|846 | |1074 | 


|1.4e-108 


Protein name 






Locus Name • 


Acc# 


I - 






sp:Y360_HAEIN 


P44661


Description 










HYPOTHETICAL PROTEIN HI0360 














ORF Name 



NTID , AAID- 



15795825 cl" 6b' 



13259 



NT 
Length 
1165 



AA 

x — ^ Score 
Length 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO-HIT 


ORF Name 


. NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


i9538327_ci_72 


1340 


3260 


1 ^ 1 


642 | 


218 


|7.0e-18 



Protein name 



Description 



Locus Name 



sp:Y882_METJA



Acc# 
Q58292 



HYPOTHETICAL PROTEIN MJ0882












ORF Name 


. NTID AAID 


NT 
Length 


AA \. 
Length 


Score 


• Probability 


197188_±3_42 


1341 3251 


345 


1038 


655 


1.3e-64 



Protein name . 



Description 



Locus Name 



sp:FMT_PSEAE



Acc# - 
O85732



METHIONYL-TRNA FORMYLTRANSFERASE


ORF Name ' NTID AAID 


NT ' 
Length- 


" AA n 
... — . , Score 
Length 


Probability 


20197175_13_44 . . 1342 ;3262 


415 


. 1248 453 


5.0e-47 


Protein name" 




i Locus Name 


■ Acc#- . 



■j Description 



sp:SMF_HAEIN



P43862



SMF PROTEIN (DNA PROCESSING CHAIN A)






ORF Name '"' • NTID AAID 


NT 
Length 


AA 

— Score 
Length 


. Probability 


23440886_t2_27 | 1343 | 3263 


1 1 


1809 241 


2;le-19 -:: 



Protein name . - 

Description , 
HYPOTHETICAL PROTEIN MJ0678



Locus Name 



sp:Y678_METJA 



Acc# 
Q58091 






ORF Name 



NT ID AAID 



241290 12 31 



NT AA 
Length Length 
65 



Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT



ORF Name 



NTID 



AAID 



24244010 c3 106 



NT " AA 
Length Length 
53 — ~ 



Score ' Probability 



0.042



Protein name 



Locus Name 



hypothetical protein Y105C5B.X 



pir :T26400 



Acc# 
T26400 



Description 



ORF Name 



NTID AAID. 



NT-" AA 
— ■ — - ,, Score 
Length Length - — — 



24253187 c'X- 107 



1 [T3?5~ 



201 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO-HIT



ORF Name 



NTID AAID 



24255262 c2 9.6 



32 67 



NT AA 
Length Length 
TJtT 1 11293 



Score Probability 
. 187 3 I 12 . 7e-87 ~~ 



Protein name 



Locus Name 1 



conserved hypothetical protein



pir:C75339 



ACC# 

C75339



Description 



ORF Name 



NTID AAID 



NT AA ■ 

- — " — , Score Probability 
Length Length •• — 



24256550 ,13 40 



TS5" 



14 . 7e-44' 



Protein name . 

Description , 
HYPOTHETICAL 17.2 KD PROTEIN IN TSX-



Locus Name 



sp:YBAD_ECOLI



. Acc# 
P25538 



RIBG INTERGENIC REGION (ORF1)






ORF Name ' 


NTID AAID 


NT 

Length . 


AA 

T — ^ , Score . 
Length 


Probability 


24337786 ir3 48 


| 1349 | |3269 


| 311 


|936 ■ 557 


2.1e 


-64 - >\ 


Protein name 






Locus Name 




Acc# 








sp:ARGI_BRUAB




Q59174 


Description 


. j ' 










ARGINASE


ORF Name 


NTID AAID 


NT , 
Length 


AA 

, — . , . Score 
Length 


Probability 


|24417762J:i 18 


| 1350 3270 


62 


|189 | |74 | 


0.030


Protein name , 






Locus Name 




Acc# 








sp : FMiAJJURMA 




P22595


Description . 












TYPE-1 FIMBRIAL PROTEIN SUBUNIT PRECURSOR








ORF Name 


NTID AAID 


NT 
Length 


AA 

T - — . , Score 
Length ... 


Probability 


24489626_12_2i 


13bl . 3271 


, 474 -■ 


1425 |1055| 


|6, 8 e 


-107 


Protein name. 






Locus Name 




Acc# 








sp:THDF_PSEPU




P25755 


Description" • 












POSSIBLE THIOPHENE AND FURAN OXIDATION PROTEIN THDF


ORF Name 


NTID, AAID 


NT 
; Length 


AA 

. — , , Score 
Length- 


Probability 


24643777_t2_22 .. 


1352 . 3272 


352 


• |1059 |582 


4 . 3e- 




Protein name 






Locus Name 




Acc# 








sp:RIBD_ECOLI 




P25539 


Description 












RIBOFLAVIN-SPECIFIC DEAMINASE


ORF Name 


NTID AAID 


NT 
Length 


AA 

\ ' ~. , . Score 
Length 


Probability 


24643937_ti_4 


1353 |3273 


450 


. 1353 . 595 . 


2.0e- 


-68 


Protein name 






Locus Name 




Acc# < 








sp:SUN_HAEIN




P44788 


Description 












SUN PROTEIN (FMU PROTEIN)






ORF Name 



NTID AAID 



NT 



" AA 



Length Length



Score Probability 



26603562_c2_86 




| 3274 


|3Q3 


912; 


|1047| | 















|9.9e-106 



Protein name 



Locus Name 



sp:FECE_HAEIN



Acc# 
P44662 



Description 

IRON(III) DICITRATE TRANSPORT ATP-BINDING PROTEIN FECE HOMOLOG



ORF Name 


NTID 


AAID 


NT . 

', : • ■ Length 


AA 
Length 


Score 


Probability 


2738783j;t2;Ji7 


1355 


3275 


\ 60 


183 






Protein name 










Locus Name . 


Acc# 


Description ; 


X 












T 


NO-HIT


ORF Name ■ 


' NTID 


AAID 


NT, 
Length


AA : 
.Length 


Score . 


Probability 


27.S026-2_tl_i.. 




J 3275 


103 


: 312 


193 


3,-ie-lS " 1 . *. 


Protein name 










Locus Name 


■ Acc# . 


"Hypotnetical 


protein SCH24.04






pir :T36569 


T36569


Description 
















ORF Name 


. NTID 


AAID 


NT 
Length 


AA 
• Length 


' Score 


Probability 


|295390i5_cL_S2 




| |3277 


..41-7 | 


1254; 


, |66.6 |. 


2.3e-65 


Protein name 










Locus Name 


' ' Acc# . 












sp:YDHH_ECOLI


P77570


Description 
















HYPOTHETICAL 39.5 KD PROTEIN IN PDXH-SLYB INTERGENIC REGION




. ORF Name 


'"■*. NTID 


AAID ' 


NT V 
Length 


AA , 
Length 


Score 


Probability 


30739700_tl_7 


1358 .■ 


|3278 


211 


636 


: , 244 


l:2e.-20 | 



Protein name 



Locus Name 



sp:YRDC_ECOLI



Acc# 
P45748



Description , ' , 
HYPOTHETICAL 20.8 KD PROTEIN IN AROE-SMG INTERGENIC REGION






ORF Name 



NTID 



AAID 



54088143 E3 bb 



3279 



.. NT . , AA 
Length Length 
1106 . I • |T2t 



— : , Score Probability 



Protein name 



Description 



.. Locus Name 



Acc# 



NO-HIT



ORF Name 



NTID AAID 



3913215 ±2 26 



NT 
Length 
165 



AA 

- — , Score 
Length — ■ 



Protein name 
Description 



Locus ' Name 



Probability 



Acc# 



NO-HIT



ORF; Name 
3939063 



NTID 



AAID 



T37TT 



TZTT 



NT 
Length 



AA : 
_ — Score 
Length • 



] EH 



Probability 
B.-Se-bO ~ 



Protein name 



Description 



LocuS: Name 



sp:RISA_PHOPO



Acc# 
P51961 



RIBOFLAVIN SYNTHASE ALPHA CHAIN


ORF Name ■ '■. 


NTID \AAID 


NT. 1 '' 
Length 


AA ; 
, — :. , Score 
■ Length 


Probability . 


3942213_c2_97 


. | 1362 v 3282 ' 


367: 


(1104 . |930 | 


|2.5e-93 . j 


Protein name 






. Locus Name 


: Acc# : 








'.- sp:GCH2_PH0LU 


Q02008 


Description 










ORF Name 


NTID AAID 


NT ; 
Length . 


' AA 

-— , k Score 
Length 


Probability 


47859ii_rl>_29 


(1363 3283 


435 


1308- , 1321 


|9.1e-13b 


Protein name 






Locus Name 


Acc# 








sp.:0AT_M0AN 


P49724 



Description . 
ACID AMINOTRANSFERASE)






ORF Name 


NTID AAID 


NT 
Length 


'AA 

t Score 
» Length 


Probability 


i52140b2_±3_53 ' | 


1364 |; 3284 V " 


411 ' 


1235 |I095 | 


|5 . 4e 


-111 


Protein name. 






Locus Name 




Acc# 








sp:SYY_HAEIN




P43836 


Description 












TYROSYL-TRNA SYNTHETASE (TYROSINE--TRNA LIGASE) (TYRRS)






' ORF Name 


NTID AAID 


NT , 
Length 


— Score 
Length - — 


Probability 


5282b62_cl_63 


1J55, p85 


255 


|798- | 593 ■ 


3.2e 


-68 


Protein name 






Locus Name 




Acc# 


hypothetical protein






pir:B71947




B71947 


Description 












ORF Name . ' . 


NTID AAID 


NT - 
Length 


AA 

. — . , Score 
Length 


Probability \ 


5070166_i:2_20 


1365. 328.5 




' 215. ; 






Protein name 






Locus Name 




Acc# 


Description . , :\ , 












NO-HIT


ORF Name ■ 


NTID AAID 


■ NT 
Length 


; ■ » 

■„ - — . , Score . 
Length 


Probability 


|6i47028ici_5fl 


1357 3257 


292 • 


, ,879 [570 | 


|5.-ee 


-87 ■ ■ ■ ■ , • 


Protein name 






Locus Name 




Acc# 








sp:YFED_YERPE




Q56955 


Description^ 












CHELATED IRON TRANSPORT SYSTEM MEMBRANE PROTEIN YFED


ORF Name ' ■ . , 


NTID , : AAID 


NT 
Length 


AA 

t ~ ". T Score 
Length 


Probability , 1 , 


839752 tl^lS 


1358 3288 


60 - : = 


183 






Protein name 






Locus Name 




Acc# 



Description , 
NO-HIT






ORF Name 



NTID AAID 



1867183 cl 68 



113-69 



NT 
n 

TT5" 



AA 

T ~~ t Score 
Length Length 

[HT7- 



Probability 
|2.7e-05 



Protein name 



Locus Name 



sp : YkAM_BACSU 



Acc# 
O07931



Description • 

HYPOTHETICAL 39.5 KD PROTEIN IN ST^-CSN INTERGEN1C REGION 



ORF Name 



NTID AAID 



I197077_t3^44 



TT7TT 



NT AA 
Length Length 
375 l' 11128 



Score Probability 
1178 I |8.2e-ll 



Protein name 



Locus Name 



hypothetical protein TM0342



pir:D72388 



Acc# 
D72388 



Description 
ORF Name 



NTID . AAID 



NT ' AA 

— ' , — ■ Score Probab ility 
Length . Length •■ : - : : -, ■ 



14-641008 13 46 



1371 



TTT 



wnr 



IT6TT 



] [ 



|6.2e-33 . 



Protein name' 



Locus Name 



putative thiol:disulfide interchange protein



gp:AF057031



Acc# 
AF057031 



Description 



Pseudomonas aeruginosa putative thiol:disulfide interchange protein precursor (dsbC) gene, complete cds.


ORF Name 


NT AA 
NTID AAID — , • . — , Score 
Length Length


Probability: 


i5058126_ri_9 


• 1372 3292 - 204 ■ 61b- 183 


3 .6e-14 . 



Protein name 



Locus Name 



hypothetical protein 



gp:AF088857 



Acc#. 
AF088857 



Description 



Vogesella indigofera indigoidine biosynthesis regulatory locus, complete
sequence.



ORF Name 



NTID AAID 



158638 c3 82 



[XT7T 



NT - AA 
Length Length 
89 



Scores 



T7TT 



Probability 
7.2e-32.' " 



Protein name 



Locus Name 



sp:IMDH_ACICA



Acc# 
P31002 



Description 
DEHYDROGENASE) (IMPDH) (IMPD)






ORF Name 


NTID AAID 


in i 
Length 


AA 

, — t , Score 
Length 


Probability 




1374 3294 " 


• 61 


186 |85 | 


0.00086


Protein name . 








Locus Name 


Acc# 


gamma-carboxymuconolactone decarboxylase


pir:B69129 


B69129 


Description 












ORF Name. 


NTID AAID 


NT . 
Length 


AA 

, — k , Score 
Length 


Probability 


|203b!J46bJ:2_2i 


\ ||1375 |329S 


|1S7 


1504 | 




Protein name 








Locus Name 


Acc# 


Description ' 












NO-HIT


ORF Name ' 


. NTID AAID 


NT 
■ Length 


■AA -score 
Length 


Probability 


20734687_c2_78 


11376 1 3.295 
I I 


1 1297 1 

J I 1 


• |894 | 642 | 


18 ..2e-63 • 1 
L — .. - ., _l 


Protein name 








Locus Name 












sp:YAAJ_HAEIN 


P44555 


Description. • 












HYPOTHETICAL 


PROTEIN HI0183 . 










ORF Name 


; NTID ■ AAID 


NT 
Length 


AA 

, — Score 
Length 


Probability 


216:720ii_ci_54 


| 1377 |3297 




186 |55 | 


(0/0095 


Protein name 








Locus Name 


Acc# 










sp; YYlOJfflTJA 


Q60309 


Description 








... 




HYPOTHETICAL 


PROTEIN MJECS10 










' ORF Name 


NTID ' AAID 


, - NT 
' Length 


AA \ 
, — . , Score 
Length 


Probability 


23439077_tl_5- 


| 1378- 3298 ■ / 


;|177 


|534 - |';,|i03 | 


10. 003b 


Protein name . 








Locus Name 


Acc# 


ORF MSV035 hypothetical protein 




gp:AF063866 


, AF063866 



Description 



Melanoplus sanguinipes entomopoxvirus, complete genome.






ORF Name 



NT ID AAID 



NT: 



AA 



Length Length 



123471^11 8 



TTTT 



13299 



Protein name 



isoleucine-tRNA ligase : isoleucyl-tRNA
synthetase



Description 



UTS" 



Score Probability 
11839 I • [2 .Oe-286 — 



Locus Name 



pirrSVECiT 



' Acc# 

B64723:S40 
549:A94277 
:A91325 :A9 



ORF Name 



NT AA 
NTID AAID — - ■ — _ Score ' 
. — — . Length Length - 



23652183 cl b6 



11380 



3300 



] [ 



777" 



12334 



Probability 
10 . 0 " 



Protein .name 



Locus Name 



outer membrane protein CopB



gp:U69981



Acc# 
U69981 



Description 



Moraxella catarrhalis strain O12E outer membrane protein CopB gene, complete
cds.



ORF Name 


NTID 


AAID 


NT 
Length 


AA ■ " 
- — Score 
Length 


Probability 


23865660_c2_77 




3301 


87 


254 




Protein name' 








Locus Name 


Acc# 


Description • 












NO-HIT


... ORF Name' 


• NTID 


AAID 


NT 

Length" 1 


AA 

. — . Score 
Length 


Probability 


p5506316-ti_14- 


| 1^82 


3302 


228 


|687 | 554 | 


1.7e-53 .. 

■ j 



Protein name 



Locus Name 



sp:YIHA_ECOLI



Description . 

HYPOTHETICAL GTP-BINDING PROTEIN IN POLA-HEMN INTERGENIC REGION



Acc# 

P24253 :P76 
771 



ORF Name 



NTID AAID 



NT AA 
—7' — ' Score 

Length Length — - 



I2584717_t3_43 



"57" 



TTT 



Probability ■ 
|3.1e-08 



Protein name. 1 



Locus ' Name 



gamma-carboxymuconolactone decarboxylase



pir:B69129



Acc# . 
B69129 



Description 



ORF Name 


NTID AAID 


3 NT . 
Length 


AA 

T ■ — , , Score 
Length 


Probability . 


|25942137_r2_29 


| |1384 | |3304 


185 | 


b58 296 


3.8e-26 


Protein name 








Locus Name 


Acc#










sp : FKUX_PSEFL 


. P21863 


Description 
























ORF Name 


NTID AAID 


Length 


, — , . Score 
Length 


Probability 

... .... ... 


2M6443i_c2L_49- 


• ■ | |1385 3305 


117 1 


SJ54 | p00 | 


|1.4e-^ 


Protein name 








Locus Name 


Acc# 



Description 



pir : FEKRV 



S7216.7 :S78 
■ 121:A00210 



ORF ' Name 


NTID AAID . . 


NT . 
• Length 


— Score 
Length ■ 1 


Probability 


|32056506_c3_8i . 


| 1386 3306 


401 1 


|1206 | |1486 | 


|3.0e-152 


Protein name 






Locus Name: 


Acc# 








sp:IMDH_ACICA


P31002


Description 










DEHYDROGENASE) (IMPDH) (IMPD)








ORF Name " .-. 


V NTID AAID 


\ NT . 
: Length 


. AA 

— , ■■ Score 
Length 


Probability 


|3247,7250_cl i _6b 


1387 .3307 


443 


1332 |1422 | 


|1.8.e-14b 


Protein name 






Locus Name 


' Acc# "" • 








sp:YCDG_ECOLI


P75892 


Description 










HYPOTHETICAL 48.1 KD PROTEIN IN WRBA-PUTA INTERGENIC REGION




ORF Name 


• ■ .NTID : AAID . 


NT-, 
Length 


AA. . . ■ 
— , Score 
Length ■ 


Probability 


34665952_t2_28 


■ 1388,. 3308, 


177 . ■ | 


534 | 371 | 


|4.3.e-34 • - 


Protein name 






Locus Name 


Acc# 




■ • . \ ' • ' ■ 




sp:LSPA_PSEFL


P17942


Description . 










PEPTIDASE) (SIGNAL PEPTIDASE II) (SPASE II)










ORF Name NTID AAID . 


• NT 
Length 


AA* . n 
„ ™. t Score 
Length 


Probability 


|4775762_rl_15 - |1389 |3309 




759, | ' 593 


1.3e-b7 • 


Protein name 




Locus Name 


,Acc#, 






sp:YRAL_ECOLI


P45528 


Description 








±1 X rr\J 1 ri.ni ± ± L-i-ilj j J. . j rvU. . rriyW 1 Hi ±1N J. IN i-Uj.f-i.J- 


-MTR INTERGENIC REGION (F286) 


ORF Name NTID ■'. AAID 


NT 
Length 


AA 

— , Score r 
■ Length 


Probability 


5910313_12_30 | 1390 piiO 


391> 


|1179 | |895 | 


|i.3e-yy 


Protein name 




Locus Name 


Acc# ' - 


homoserine O-acetyltransferase




Igp : LMMUTVX 


Y10744 


Description ,.; 1 


L. meyeri metY and metX genes.


.ORF Name NTID- AAID 


NT 
Length 


AA 

T — , " - Score .• 
Length «_..... 


r 

Probability ■ 


5976592_t2_41 1391 3311 


152 


459 |276 | 


|5,0e-24. 


Protein name v : 




Locus. Name 


' , Acc# . 


1 LportX . 


■ . r - 


gp:LPU63641


U63641 .. 


Description ' " . 


Legionella pneumophila rpoD operon LporfX, LpdnaG, and LprpoD genes, complete


ORF Name NTID AAID


NT 
Length 


AA 

■ , — . , - Score 
Length .. 


Probability 


|8iS7S5_il_7 " 1392 . | |3312 


98 


2.97 . ;■. ' 




Protein name' ■ - 

• — 1 ~ — h ' 




Locus Name 


Acc#. 


Description 








NO-HIT


ORF Name NTID AAID 


' NT 
Length 


AA 

-y — , , .Score 
Length 


■Probability 


9765832_t2_38 1393/ 3313 


458 


: 1377 |1102 | 


|1.5e^lil 


Protein name' 




■ Locus Name 


Acc# 


homoserine dehydrogenase




gp:L78665 . 


L78665 



Description 



Methylobacillus flagellatum aspartate aminotransferase (aat), membrane
protein (orf-1), homoserine dehydrogenase (hom), and threonine synthase (thrC)
thymidylate synthase (thyA) genes, complete cds.






ORF Name 



15773436 12 31 



Protein name. 



NT ID 



AAID 



11394 



J3314 



NT AA 
Length Length " 



Score 



p-TT" 



Locus Name 



probable 24-sterol C-methyltransferase



pir:T03845



Probability ; 

ACC,# 
T03845 



Description • 

ORF Name . 
|1042312b_g2_44 

Protein name 
Description 
NO-HIT



NTID AAID 



TT91T 



T3TB" 



.'- NT AA 
Length . Length 
124 



■ Score Probability 



Locus Name 



Acc#



ORF Name . 



NTID 



AAID 



1059202 F2 13 



3TT5T 



NT AA \ 
Length Length 
" 



Score .■ ' Probability 



, Protein name 



Description 



Locus Name 



. Acc#' 



NO-HIT


ORF Name , . 


NTID . 


AAID 


NT 
; Length 


AA 
Length 


Score .". 


'•■ Probability; 


12933427_i2 2 5 


1397 


J i : l 7 


131 , , 


396 




239 


, 4.1e-20 . 



Protein name 



Locus, Name 



sp:DHSC_ECOLI



' ' Acc# 
P10446



Description ■•- . 
SUCCINATE DEHYDROGENASE CYTOCHROME B-556 SUBUNIT



ORF Name 



NTID 



AAID 



20330461 13 19 



'; NT AA " 

Length Length 
[229 



Score . Probability ' 
|714 | |f : : 9e- 70 ~" 



Protein name 



Locus . Name 



fumarate reductase flavoprotein subunit



gp:AB015757



Acc# 
AB015757 



Description 1 •' . \ \ . t ■ 

Rhodoferax fermentans genes for fumarate reductase subunits, complete cds.






ORF Name 



NTID AAID 



214155 t2 9 



NT 
n 

75T 



AA 

, t t Score 
Length Length . — ^ — - 



Protein name 



Locus Name 



sp:ODO1_AZOVI



Probability 
|l.le-2-66' 

Acc# 
P20707 



Description , 



KETOGLUTARATE DEHYDROGENASE)



ORF Name 



NTID 



AAID 



21501557 ±3. 27. 



[MuTT 



NT , AA 

Length Length 

so ■ 



— v, Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT 



ORF Name 



NTID 



AAID. 



21510531 12 5 



TT2T" 



NT AA 
Length Length 
381 I 11145 



Score 



Probability ' 



Protein name 



■ Locus Name 



fumarate reductase flavoprotein subunit



gp:AB015757



Acc# , 
AB015757 



Description 



Rhodoferax fermentans genes for fumarate reductase subunits, complete cds.



ORF Name 



NTID 



AAID 



V — , Score Probability 



23469010 fl 2b 



TSZT 



NT AA 
Length Length 
1 fTWT 



Protein "name 



Description 



Locus Name 



Acc# 



NO-HIT 



ORF Name 



NTID • AAID 



23855057 c3 53 



TT2T" 



NT AA 
Length Length 
183 • 



Score Probability 
552 | |76 | |0.018 ~ 



Protein name 



•Locus Name 



putative adhesin MAA1 



gp:AF154922



Acc# ■■ 
AF154922 



Description 



Mycoplasma arthritidis strain 158 putative adhesin MAA1 (maa1) gene, complete
cds.






ORF Name NT ID AAID 


NT 

Length 


AA 

T - — Score 
Length 


Probability 


24241463_t2_10 1404/ |. 3324 


68 . 


207 m 


2.6e-07 . * ; 


Protein name 




Locus Name 


Acc# 






sp:ODO1_HAEIN


P45303 


Description ■ 








KETOGLUTARATE DEHYDROGENASE) 




ORF Name NT ID AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


24251441_ri_4 | |1405 | |3325 


375 


1128 |129 | 


b.Be-05 • | 


Protein name 




Locus Name 


Acc# 


heme receptor 


gp:VIBHUTA


L27149


Description 








Vibrio cholerae heme receptor (hutA) gene, complete cds.






ORF Name . NTID AAID 


, NT 
Length 


AA ■ 
— ■ Score 
Length • 


Probability 


24427262_i2 J3 ■ , 1405 | 3326 


123. "\ 


372 234 


2.6e-18 


Protein name 




Locus Name 


Acc# 


alpha-ketoglutarate dehydrogenase 


gp:AF068740


AF068740



Description 



Pseudomonas putida dihydrolipoamide succinyltransferase (kgdB)
and alpha-ketoglutarate dehydrogenase (kgdA) genes, complete cds.



ORF Name 



NTID 



AAID 



26367192 ti 2 



13327 



NT 
n 



AA 

* i t Score 
Length L ength r> — — • 



1467 



EH [ 



Probability 
3 . Oe-143 ' " 



Protein name 



Locus Name 



dihydrolipoamide dehydrogenase 



- tap :PSEL!Pt)A 



Acc# 
M28356 



Description ' 
P. fluorescens dihydrolipoamide dehydrogenase (lpd) gene, complete cds.



ORF Name 



NTID 



AAID 



NT AA 
T T ~ — ^ Score 
Length Length — 



26377042 .12 7 



T5T 



|7W 



Probability 
|1.7e-78 ~ 



Protein name 



Locus Name 



succinate dehydrogenase putative iron 
sulphur 



Acc# . 
Y13760



Description .... 

Shewanella frigidimarina NCIMB400 sdhA, sdhB, sdhC, sdhD and sucA genes.






ORF Name 



NT ID 



AAID 



3143937b- 11 1 



T3TT9" 



NT AA , 

Length Length 
135 



Score 



Probability 
IS.3e-24 



Protein name 



Description 



Locus Name 



sp:DHSD_ECOLI



Acc# 
P10445 



SUCCINATE DEHYDROGENASE HYDROPHOBIC MEMBRANE ANCHOR PROTEIN




- : NT AA 
ORF Name . NT ID AAID — , - — , 

■ Length. Length 


Score 


Probability 


4054425,_t3_25 j 1410 | (3330. | \B99 . J1800 


1 P 3 ' 


|8.1e-05 • 



-Protein name 



' Description 



Locus Name 



sp:FOXA_SALTY



Acc# 
Q56145 



FERRIOXAMINE B RECEPTOR PRECURSOR (FRAGMENT)


■i- 




ORF Name 


" NT 
NT ID AAID . —7 \ 
Length 


AA .. . 
T — -, v 1 Score 
Length 


Probability . 1 


S8462b i jt2_li 


1411 3331 420 • 


1263 |1134 | 


p.^e-121 • 


Protein name 




Locus Name' 


Acc# :. 


dihydrolipoamide
S-succinyltransferase : 2-oxoglutarate
dehydrogenase complex chain E2 : succinyl


pir:y07779 ' . 




flip: obi 
. 511- 


Description 









ORF Name. 



NTID 



AAID 



9928130, cl' 34 



NT AA 
Length Length 
141, 



1426 



Score ,' 
1 115 I 



Probability 
i8.ke-0 7 , ■ , ■ 



Protein name 



Locus Name 



microfilarial sheath protein SHP3 precursor



Description 



gp:AF030944



' Acc# 

AF030944:U43510



Brugia malayi microfilarial sheath protein SHP3a (Bmshp3a) and microfilarial
sheath protein SHP3 precursor (Bmshp3) genes, complete cds.



NT AA 

ORF Name ' NTID AAID _ . — , — Score 
— ~ -■ — — — — Length Length 



112619081 c3 114, 



"3T3T" 



ITT 



Probability 
1.7e-16 ~ 



Protein name 



Lo.cus Name 



sp:YBAN_ECOLI



Description .' • 
HYPOTHETICAL 14.8 KD PROTEIN IN PRIC-APT INTERGENIC REGION



Acc# 

P45808 :.P77 
478 






ORF Name 


NT ID 


AAID 


NT 
Length 


• AA 

, — : Score 
Length 


Probability 


XZoj/bbZ C-L 15 












Protein name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


■ NT ID 


AAID 


" NT 
Length 


AA 

, — . , Score 
Length 


Probability 


1359776_c2_91 


1415. 


3335 


67 : 


204 :' 




Protein name 








Locus Name 


Acc# 


Description' 












NO-HIT


ORF Name 


NTID 


. AAID 


NT 

..Length 


AA ■'■ . 
— Score 
Length 


..Probability 


1406402a_ca_i.0b 


■ 1416 


. 3336 


269 


|810 |182 | 


- |4..5e-14 ' ' 



Protein name 



Description 



Locus. Name 



sp:YEAB_ECOLI



ACC#; 

P43337



HYPOTHETICAL 21.4 KD PROTEIN IN PABB-SDAA INTERGENIC REGION






ORF Name ' 


NTID ! ' AAID 


NT - . AA 
Length 'Length 


Score ' 


'Probability '-' 




146 57782_c2_104 


1417 3337 


173 ■ , 522 


333 


4.Se-30 





Protein name ... " ■ -- ( . 

Description ' '.\ . > 

2) (DTB SYNTHETASE 2) (DTBS 2)



Locus Name 



sp:BID2_HAEIN



Acc# 
P45248 



ORF Name 



NTID. 



AAID 



14719437 11 22 



nt aa : 

Length Length 
63 I ' TTF2 



Score . Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT 






ORF ; Name 



NT ID AAID 



114882713 c3 116 



T5T5" 



JJJT 



NT 
Length 

mi 



AA 
Length 
1854 • 



: Score ! « Probability 



8.1e-30 



Protein name 



Locus Name 



sp:BIOC_HAEIN



Acc# 
P45249 



Description . 
PUTATIVE BIOTIN SYNTHESIS PROTEIN BIOC



ORF Name 



-NTID 



AAID 



154647b0 c2 85 



TT2TT 



NT . 
Length 
325 



AA 
Length 



Score ' Probability 
[T2T~ 



|5.0e-05 



Protein name 

Description ' . 

CELL DIVISION PROTEIN ZIPA



Locus Name 



sp:ZIPA_ECOLI



Acc# 
P77173 



ORF Name ' ' 


NTID 


AAID 


V NT 
Length . 


AA . . 

Score 

Length- 


Probability 


15532256_tl_l' 


- 1 1421 


• 3341 


80 


243 . |95 j 


|0.0015 ; - • 


Protein name 










Locus Name 


-Acc# . 


ubiquitin protein ligase








pir:T39585


T39585 


Description ■ 














ORF' Name 


NTID 


AAID 


NT 
Length 


. AA 
Length 


Probability 


19572130_13_58 


i422 


3342. 


310 | 


, 933 ; | |1010 | 


|8.2e-102 


Protein name 










Locus Name 


Acc# ' 








• ■- 




sp:CYSM_ECOLI


P16703 


Description 














(O-ACETYLSERINE (THIOL)-LYASE B) (CSASE B)








ORF Name 


■ NTID 


AAID 


NT' 
. Length 


AA 

, — , , Score 
Length 


Probability 


19734630 £2 40 

— ■ — - 


1423; 


|| 33 4 3 


1 I" 3 


1502 |400 ] 


|3.1e-b7 | 



Protein name . ' 

Description 
HYPOTHETICAL RNA METHYLTRANSFERASE HI0333



Locus . Name 



sp:YGCA_HAEIN 



Acc# 

P44643



.ORF Name 



NTID 



AAID 



20507762- 11 13 



T3¥3~ 



NT 
Length 
290 



AA 

" : — ^ - Score Probability 
Length — . ■ — • — — L . 



Protein name 



Description 



|873 | |b48 | | 7.Se-b3 ~ 

Locus Name . . Acc# 



P10740 



PHOSPHATIDYLSERINE DECARBOXYLASE PROENZYME


NT- 

ORF Name' NTID AAID ;. — ' 
• 1 : .,• . • Length 


AA 
Length 


Probability 


20839062_c2_92 | {1425 | 3345 444 | 


|1335 | |482 | 


|1.3e-48 | 


i, • •• • • , . " . • 
Protein name ' 


Locus Name 


Acc# 




sp:DEAD_HAEIN


P44586 


Description "■ , . 






ATP-DEPENDENT RNA HELICASE DEAD HOMOLOG


ORF Name NTID AAID' ■ — , 

• Length 


AA 

i ■ — Score , 
Length 


Probability 


22144026_tl_2,5 . 1425 3346 '284 


855 |443 | 


|4.3e-41 . 



Protein .name 



Description ■ . 



Locus Name : 



sp:RELA_HAEIN



Acc# , . 
P44644



(PPGPP SYNTHETASE I)


ORF Name , " NTID 


AAID. 


NT..; 
Length 


AA 
Length 


Score 


..Probability - " 










22147806 t3_47 . 1427 


- -3347 r 

- 


421 


|1266 


\66$ | 


|i.5e-86 



Protein name 



Description 



Locus - Name 



lsp.:YSXClEC6Lr 



Acc# 
P24196 



(0386J ' " V • ' :- . 1 . " 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


22691.300_c.2_99 


1428 


3348- 


612 . 


1839 


|688 | 


|7.1e-72 



Protein .name 
sensor kinase rtpA



Locus Name 



gp:AB002529



Acc# ■ 
AB002529 



Description < 
Pseudomonas tolaasii gene for sensor kinase rtpA, complete cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 

— . , Score 
Length 


Probability 


22S909I7_12_33 


. 1423 


3349 


258 | 


|777 - | 195 


l.Be-15 . 


Protein name 








Locus Name 


Acc# 










sp:YBEN_ECOLI 


P52085


Description 












HYPOTHETICAL 24.5 KD PROTEIN IN PHPB-HOLA INTERGENIC REGION (ORFUU)


ORF Name 


- NTID 


AAID 


NT 
Length 


AA 

— Score 
Length , - 


Probability 


2347'8458_c2_i03 


| |1430 


| 3350 


|44i .. | 


1325 (1328 | 


1.7e-135 



Protein name 



Locus Name 



BioA 



gp:AF191556



Acc# 
AF191556 



Description 



Xenorhabdus nematophilus YbhE (ybhE) gene, partial cds; Var1 (var1) and BioA
(bioA) genes, complete cds; and unknown gene.



ORF Name 



NTID AAID 



p409-7812_ci_83 



13351 



NT AA 
Length Length 
[277^ 



Score Probability 



Protein name 



j |627 | ' |452 | | 9.7e-44 .■ — 

Locus Name Acc# 



sp:SSB_HAEIN



P44409 



Description ■ ' ' . 

SINGLE-STRAND BINDING PROTEIN (HELIX-DESTABILIZING PROTEIN)



ORF Name ' ■ 


NTID • 


AAID ., 


NT 
Length 


AA *' 
„ — . , • Score , 
Length 


.Probability • 


24225088_t2_-39 


1432 


3352 


61 


186 




Protein ' name : 








Locus Name 


: • acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT 
Length 


AA , : n ' - 

, — . , Score 
Length 


Probability ' 


25587827_C1_79 


1433 


3353 


| 400 


.' 1203 888 


V.0e-89 


Protein name 








Locus Name 


Acc# 



Description 

LIGASE) 



sp:BIOF_HAEIN 



P44422 






ORF Name 



NTID AAID 



25192150 FT 64 



NT 
n 



AA 

t — Score 
Length Length . — 



Probability 
1 . 6e-89 



Protein name 



- Description 



Locus Name 



sp:RELA_ECOLI



ACC# 
P11585 



(PPGPP SYNTHETASE I)


ORF Name NTID 


AAID 


NT 
Length 


AA . 
Length 


Score. 


Probability 




1 P 355 


|pW | 


636 1 


.pso | 


|4.7e-35 | 



Protein name 



Locus Name 



Description 



Acc# 

U90439:AE002093



Arabidopsis thaliana chromosome II section 227 of 255 of the complete
sequence.



ORF Name 



NTID AAID 



33708181 ci 



3356- 



NT AA 
Length Length 
411 | |1236 



Score Probability 
PUT" 



3.2e-3S 



Protein name 



Locus Name 



putative histidine kinase 



gp:PST249741



Acc# ■ 
AJ249741



Description .. ■ . . . . ' 

Pseudomonas stutzeri JM300 gacS (partial) and ggtB (partial) genes.



ORF Name ': 


NTID AAID .'■ 


NT- - 
Length' 


: AA 

, ; — . v Score' 
Length 


Probability 


33728258_c3_117 


1437 


3357 . 


T 8 1 


. 237 , |85 | 


|0.00I9 .. • , 


Protein name 








, . Locus Name 


Acc# 










sp:BID2_HAEIN


. P45248 


Description 












2) (DTB SYNTHETASE 2) (DTBS 2)








ORF Name 


NTID AAID 


NT 

. Length , 


AA ' 

— , Score 
Length 


Probability ■ ■ 


35i73953J:i_7 


1438 


3358" 


1.52 : | 


453 • 237 


6.8e-20 | 



Protein name 



Locus Name 



sp:YBEB_ECOLI



Description 

HYPOTHETICAL 11.6 KD PROTEIN IN MRDA-PHPB INTERGENIC REGION



Acc# 

P05848:P77107






ORF Name 


"NTT* T 'P* T\ 7V Tr\ 


NT 
Length 


AA 

~ r — . , Score 
Length . . 


Probability 


35183451_r2_38 


1439 3359 


261 


786 


259 


3.1e-22


Protein name 








Locus Name 




Acc# 


hypothetical protein jhp0628




pir:B71907


B71907


Description 
















ORF Name 


. NTID AAID 


NT 
Length 


AA 

. — . , : Score 
Length 


Probability 


4147537_G3__i20 


1440 | 3360 


990 ' 


2973 


3281 


0.0 


■ ' . 1 


Protein name 








Locus Name 




ACC# 










sp:UVRA_ECOLI




P07671:P76788


Description. 














EXCINUCLEASE ABC SUBUNIT A


ORF Name 


NTID AAID • • 


NT 
Length 


AA h ■ 
„ — ■'; , Score 
Length 


Probability 


4199006_£3_56 • 


1441 . j 3351 


69 - - 


210 




[0.022 


Protein name 








Locus Name 




Acc# 


NADH dehydrogenase subunit 4






gp:AF026170 




AF026170. 


Description 
















Teius teyou NADH dehydrogenase subunit 4 (ND4) gene, partial cds; and
tRNA-His, tRNA-Ser, and tRNA-Leu genes, complete sequence, mitochondrial genes
for mitochondrial products.














\ ORF Name 


NTID AAID 


NT 
Length^ ' 


AA * 
— = . Score 
Length' 


Probability '' . ' 


|4328431 13 65 


| 1442 | 3362 


305 


918 


542 


3.2e-52


Protein name 








Locus Name 




Acc#. 










sp:FPG_NEIME




P55044


Description 
















GLYCOSYLASE)


ORF Name 


NTID AAID 


NT 
Length" 


AA Score 
Length 


Probability 


4B67812_c3_118/ 


1443 ' 3363 


154 


.465 


318 


1.8e-28


Protein name 








Locus Name . 




Acc# 










sp:YIHZ_ECOLI




P32147 


Description 
















HYPOTHETICAL 15.9 KD PROTEIN IN RBN-FDHE INTERGENIC REGION (O145)








ORF Name 



NTID 



AAID 



892177 cl 70, 



NT AA 
Length Length . 
T£"3 — 



Score Probability 
|7'.4e-30 . 



Protein name 



Description 



Locus Name 



gp:D83386



Acc# 
D83386 



Shewanella violacea rhlE, cydD, cydC and putA genes, partial and complete cds.



ORF Name 



NTID 



AAID 



1684-7336 t3 5 



NT AA - 

Length Length 
177 



Score 



Probability 
7.3e-62 



Protein name 



Locus Name 



DNA-directed RNA polymerase alpha chain



gp:AF047025



Acc# 
AF047025 



Description 



Pseudomonas aeruginosa ribosomal protein S4 (rpsD) gene, partial cds;
DNA-directed RNA polymerase alpha chain (rpoA), ribosomal large subunit
protein L17 (rplQ), and catalase isozyme A (katA) genes, complete cds; and
bacterioferritin (bfr) gene, partial cds.



ORF Name 



NTID AAID 



' NT AA 1 
T — ^. — ' Score Probabil ity- 
Length Length — - 1 — — u - 



16975442 c3 13 



• [7TD- 



|8.-0e-33- 



Protein name 



Description 



Locus Name 



sp:YFCM_ECOLI



Acc# ; 

P76938:P76497



HYPOTHETICAL 21.1 KD PROTEIN IN FABB-MEPA INTERGENIC REGION



ORF Name 



NTID . AAID 



24226655 12 3' 



NT. AA 

r ^^v, t "v. Score 
Length Length 



Probability 
J [499 | |1.2e-47 ~ 



Protein name 



Locus Name 



DNA-directed RNA polymerase alpha chain



Description 



gp:AF047025



Acc# v 
AF047025 



Pseudomonas aeruginosa ribosomal protein S4 (rpsD) gene, partial cds;
DNA-directed RNA polymerase alpha chain (rpoA), ribosomal large subunit
protein L17 (rplQ), and catalase isozyme A (katA) genes, complete cds; and
bacterioferritin (bfr) gene, partial cds.






ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


243.17bul__t2_4 v 


11448 


3368 


83 


252 




354 


2.7e-32


Protein name 










Locus Name 




Acc# 












sp:RL17_PSEAE




O52761


Description 




















50S RIBOSOMAL PROTEIN L17


ORF Name , 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


3001693_r2_2 


(1449 


(3369 


IF 7 1 


|654 




683 | 


3.7e-67


Protein name 










Locus Name ' 




Acc# ■ ■ 


ribosomal protein S4 


pir:A64095




. A64095 


Description 




















ORF Name 


' NTID 


AAID . 


NT ■ 
Length 


Length 


Score 


Probability 


4867143_cl_9 


' 14B0 


3370 . 


191 j 


576 




314 


4.7e-.28 


Protein name -i 










Locus Name . 




Acc#' 


probable translation factor yciO




pir:F64874




F64874 


Description 




















ORF Name 


NTID 


AAID ' 


NT V 
Length 


AA ' 
Length 


Score . 


' Probability 


6033377_c3ji4 


1451 


3371 


94 | 


|285 




84 


6.035 ,\ ; 


Protein name V ;■■ , 










Locus. Name 




Acc# - 


hypothetical protein C34F6.9






pir :T19736 . 


T19736 


Description 




















ORF Name


NTID: , 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


10437517_cl_70 - 


1452 


3372 


62. 


189 ■ 










- i ■ 
Protein name ' 










Locus Name 




Acc# 



Description : 
NO-HIT






ORF Name 


NT ID . 


AAID 


\ NT ' ," 
Length 


AA • 

„ — . , Score 
Length 


. Probability 


1 1 305575 c2 83 


|14 52 


3373 


72 


219 






Protein name • 










Locus Name 




Acc# 


Description 












i. ■ ■■ 




NO-HIT


ORF Name 


NTID . 


AAID 


NT : 
. Length 


AA 

„ — . , . Score 
Length 


Probability 


1359S77_r2_17 . ■ 


(145,4 


3374 


373 | 


|II22 | 96 9 


1.8e-97


Protein name 










Locus Name 




ACC# : 


uroporphyrinogen decarboxylase




gp:ECOUW89


U00006 


Description ; . 














E. coli chromosomal region from 89.2 to 92.8 minutes.






" ORF Name 


NTID/ 


AAID 


NT * 
, Length 


AA . ' 
— ■• Score 
Length 


' . Probability . 


i4898317_c3_94 ■■ 


1455 


3375 


. 551 


1776 |17,S9 | 


3.1e-182


Protein name 




r. i 






Locus Name 




Acc.# 












sp:SYD_ECOLI




P21889


Description 
















(ASPRS)


ORF Name , ." 


NTID 


AAID 


NT - 
Length 


AA- 

r — t . Score 
Length • \ 


Probability 


16522206_cl_56 . 


|I45* - 


3376 


207 | 


r 


24 .• „ Y 






Protein name, , , 










Locus Name 




. Acc# 


Description 
















NO-HIT


ORF Name« 


NTID . 


AAID . 


NT "\ 
Length. 


AA 

. — Score 
Length • 


Probability 


i5Si4042^c3_107 


|1457 


3377 • 


125 | 


|378 142 . , 


7.9e-10


Protein name 










Locus Name 




• Acc# • 


hypothetical protein slr1903




pir:S77514


S77514 



Description 






ORF Name 



NTID 



AAID 



T7WT1 T3 T2 



NT AA 
Length Length 
1442 | " 1132 9 



Score Probability 
[1020 | [7T2e-103 



Protein, name 



Locus Name 



glyceraldehyde-3-phosphate dehydrogenase



|gp:AF058302 



Acc# 
AF058302 



Description 



Streptomyces roseofulvus frenolicin biosynthetic gene cluster, complete
sequence.



ORF Name 



NTID 



AAID 



20984532. ci 68 



3379 



NT AA 
Length Length 




; — Score Probability 



TFT" 



Protein name'. 



Description 



Locus Name 



Acc# 



NO-HIT


■ORF Name NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability' 


2142i51_r3_38 • 1450 3380 ■. 


252 


759 421 


2.1e-39 


Protein name 




Locus Name 


■. Acc# 


anion transport ABC transporter (ATP-binding) homolog ytlC


' pir:C69995 


C69995 







Description 



' ORF Name 



NTID 



AAID 



23437558 12 24 



TOT 



Protein name 



3-phosphoserine aminotransferase



Description 



NT AA • 
— — — Score 
Length Length — — 



TuT7~ 



Locus Name 



gp:AF038578



Probability 
5.0e-95 ' 



Acc# 

AF038578:M73971:M35545



Pseudomonas stutzeri gyrase A subunit (gyrA) gene, partial cds;
3-phosphoserine aminotransferase (serC), chorismate mutase/prephenate
dehydratase (aroQp/pheA), imidazole acetolphosphate aminotransferase (hisHb),
and cyclohexadienyl dehydrogenase (tyrAc) genes, complete cds;
and 5-enolpyruvylshikimate 3-P synthase (aroF) gene, partial cds.






ORF Name 


- NT I D 


AAID 


NT , 
Length 


AA 

, — . , Score 
Length 


Probability 


ni^^aaa — ft — 7 


1 4 £ ? 




- g.5 


158 




Protein name 








.. Locus Name 


Acc# 


Description 












NO-HIT


ORF Name 


NT I D 


AAID 


NT 
Length 


AA ^ ^ 
, — . , Score 
Length 


Probability 


23542875_t3_39 


1453 


- 3383 




758 




Protein name 








Locus Name 


'■ Acc# 


Description 












NO-HIT


ORF Name 


NT ID 


AAID 


NT 
Length 


AA '> • • 
— , ' Score 
Length . 


Probability 


24I03I37_t2_l-6 . 


| 1454 


| |3384 


409 - 


1230 |1079 | . 


|4.0e-109 


Protein name 








. Locus Name 


Acc# 



Description 



sp:YHBZ_HAEIN



P44915 



HYPOTHETICAL 43.4 KD GTP-BINDING PROTEIN HI0877








ORF Name 


NTID 


;. AAID 


NT AA 
Length Length 


Score ' 


Probability 


24272135_c3_i03,. 


| 145 5. 


3385. 


' 174 , 525 - ■ 




295 


4r8e-25 



Protein name 



Locus ' Name 



Lrp-family transcriptional regulators



gp:D89015



Acc# 
D89015 



Description 



Pseudomonas putida genes for MdeR, MdeA and MdeB, complete cds.




ORF Name 


NTID 


' NT AA ■ ■ 

AAID — -. , , . Score 

- •• Length Length 


-Probability 


244i0638_r2_i9. 


| 1455 


|3385 443 . 1332 . 743 


1.5e-73 


Protein name 






'-Locus Name - 


Acc# 


proteinase DO


pir:H71935 


H71936 



Description 






ORF Name 



NT ID AAID 



TTST" 



NT ' AA " 
Length Length 
304 



Score 



Probability 
3.7e-51 



Protein name 



Locus Name 



sp:YJJP_ECOLI



Acc# 
P39402 



Description ■ ■■■ _ - . 

HYPOTHETICAL 30.5 KD PROTEIN IN DNAT-BGLJ INTERGENIC REGION (F277)



ORF Name 


NT ID 


AAID 


NT 
Length 


AA , n 
. , — . v Score 
Length 


Probability 


25665963_tl^S 


|I4SS 


3388 . 


264 


|795 | 444 


7.8e-42 


Protein name 








Locus Name . . . 


Acc# • 










sp:GLO2_ECOLI


Q47677 


Description 












II) (GLX II) 


ORF Name 


NTID 


AAID 


NT 
Length 


AA . 

— Score 
Length 


Probability 


2757750_13_47 


1459 


3389 


72 


219 [73 . | 


|0.016 



Protein name . . 

Description , 
plasmid ColIb-P9 DNA, complete sequence. 



Locus Name 



gp:AB021078



Acc# ) 
AB021078 



ORF Name 


1 NTID 


AAID 


NT ". ■; 
Length 


' AA ' a 
T ~ ~. i Score. 
Length 


. Probability 


31412958_t2_23 


|1470 


3390 


250 - 


753 . , 




Protein name 








Locus Name 


. Acc# . 


Description 












NO-HIT


ORF Name . 


NTID 


AAID 


NT 
Length 


AA 

— ■ Score 
Length' 


Probability ' - 


33394002_12_30 


|1471 


3391 


507- 


|1524 , |79 | 


|0.036 



Protein name 



Locus Name 



cytochrome-c oxidase chain I RP405



[pir :D71598 



Acc# 
D71698 



Description 






ORF Name' 


NTID 


AAID 


NT- 
, ! . Length 


AA ' 

— . , Score 
Length - ■ • 


Probability 


|35iB1680_c3J)5' 


| 1472 


. 3392, 


| - 356 , 


1071 257 


. 4.5e>23 


Protein name . r 










Locus Name 


Acc# 












gp:PPY14558 

. j ' - ■ r- 


Y14568 


Description 














Pseudomonas fluorescens tag gene and partial glyQ, htrB genes.


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


|4009750_cl_69 


| 1473 


\P' m 


1 pi 


|555 |279 


| p.4e-24 


Protein name. 










Locus Name 


■■ . Acc# ' 


hypothetical protein ; 


pir:S76551


S76551 


Description. 












•• ■ ' 


ORF Name 


NTID 


AAID 


'NT 
Length 


AA \ o . 
, ^ , Score 
Length .. 


Probability . ' 


4165952_t3_34 


| 1474 


339.4 


| 82 


249 




Protein name 










Locus Name 


.Acc# 


Description 














NO-HIT


ORF Name 


NTID 


AAID 


NT.' 
■. , • Length 


AA • 

T — . i" ■ Score 
Length , 


Probability 


4328443^ci_74 • 


.1475 


3395 


176 


531 193 


3:le-15 ; 


Protein name 










Locus Name 


Acc# .;■ 


hypothetical protein






pir:G75479 


G75479 .. 


Description 














ORF Name 


NTID 


.. AAID 


NT , 
' r Length 


AA ' . 
.— , Score 


Probability 








Length, » 




|4423193^c2_79 • 


| 1475 


3395 


• | 85 , 


•258 |87 


| '|4.6e-07 


Protein name 




Locus Name 


Acc# ' ' 



sp:ARGD_ARCFU



O30156



Description , 
ACETYLORNITHINE AMINOTRANSFERASE (ACOAT)






ORF Name' 



NTID. 



AAID ' 



14854077 c2 7U 



PT7T 



U5T 



NT : AA 
Length Length 

rr. — 



Score 



T5T 



Probability 
i.4e-10 



Protein, name 



Locus Name 



unknown 



gp:AF062531



Acc# 
AF062531 



Description 



Pseudomonas putida GB-1 signal peptidase (pilD) gene, partial cds; and
unknown genes.








ORF Name 


NT AA 
NTID AAID — . — , 
Length ■ Length 


Score 


Probability 


4878407_tl_6 


|1478 3398 | 589 [1770 | 


1355 | 


2.3e-138 



Protein name 



Description 



Locus Name 



sp:LEU1_YEAST



Acc# 
P06208 



SYNTHASE) (ALPHA-IPM SYNTHETASE)












ORF Name 


NTID AAID 


NT . 
Length 


AA, 
Length 


Score 


Probability 


b085963_tl_ 


11 • - • 14') 9 | 3399 


|243 


7 32 


124 | 


|3:2e-13. . j 



Protein name. 



Description 



Locus Name 



sp:YDFN_BACSU



Acc# 
P96692 



PUTATIVE NAD(P)H NITROREDUCTASE YDFN










ORF Name 


NTID ' AAID" 


NT 
Length 


AA 

, — . , ■ Score 
Length 


Probability 




1480 . ■'. 3400 


330 


993 585 


9.0e-57


Protein name . 








Locus Name 




Acc# • 


hypothetical 


protein TM0484 






pir:C72369


i t 


C72369 


Description 












v. • 


ORF Name 


NTID AAID 


'•• NT 
Length 


AA . ... 
. — , Score 
Length 


' Probability 


5256588_t2_29 


1481 . 3401 


878 


2537 12754. 


1.3e-286


Protein name 








Locus Name 




Acc# 


UspA1




gp:AF113606


AF113606



Description i . 

Moraxella catarrhalis strain ATCC25238 UspA1 (uspA1) gene, complete cds.






ORF Name 



NTID 



AAID 



802137 ti T7 



NT 
Length 
1251 * ; 



AA 

x Score 
Length 



7ST 



Probability 
2.0e-43' 



Protein name 



Locus Name 



ABC transporter, permease protein, cysTW family



pir:D72369



ACC# . 

D72369 



Description . 
ORF' Name 



NTID AAID 



894387 c2 80- 



NT 
Length 

nrsT5 



AA 
Length 

Rnn — 



Score 



ITT 



Probability 
6.0e-28 



Protein name 

Description 
HYPOTHETICAL PROTEIN HI0108



Locus Name 



sp:YJJP_HAEIN



Acc# . 
P44520 



ORF Name 


NTID . 


AAID 

. .f 


NT 
Length 


AA • n 
— : , Score 
Length . 


.Probability 


976bby_t2_18. ; 


| 1484 


3404 


ir 


186 | 




Protein name . 






V 


Locus. Name 


Acc# ■ 


Description 












NO-HIT


ORF. Name 


NTID 


AAID 


NT 
Length 


AA 

. '■— j , Score- 
Length 


■Probability 


10740682,. c2 12 ■ 


1485- 


34 05-; 


29.7 


894 | |678 . | 


|1.3e-66 


Protein name 








Locus Name


, / Acc# 


probable acyl-CoA dehydrogenase




. pir:B75282 . 


B75282 .. 


Description 


■ '< 










. ORF Name ■- 


\. NTID 


AAID 


NT 
Length , 


AA 

— Score 
Length , 


Probability 


16829202J:3_8 


I486 


3406 


. 251 


753 185 - 


• 2 ..2e-14 


Protein name 


V- J. 






Locus Name., 


ACC# . 



sp : PABC_ECOLI 



P28305



Description • , • . , . 

4-AMINO-4-DEOXYCHORISMATE LYASE (ADC LYASE)






ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


25421HO / 1 1 Z 


14« / 


.34 U / 


loo 


567 






Protein name 








Locus Name " 


Acc# 


Description 














NO-HIT


ORF Name i, - 


NTID 


AAID 


NT. 
Length 


AA 
Length 


Score . 


Probability 


341609i8_i3_7 - 


1488 


3408 




195 






Protein name 








Locus Name 


■ ' . Acc# 


Description 














NO-HIT


ORF Name . . 


NTID 


■AAID 


NT , 
Length' 


AA 
Length 


Score 


Probability 




1489 


|3409 . 


■•275 


I 828 


475 


|4.1e-45 


Protein name . 








Locus Name 


Acc# 


shikimate dehydrogenase 






gp:NPU82846


U82846 


Description 














Neisseria pharyngis var. flava shikimate dehydrogenase (aroE) gene, complete
cds.


ORF, Name • 


NTID 


AAID 


NT ; 
Length 


■ : AA 
Length 


Score 


Probability 


12156 514 cl_l 6 


1490 


3410 


; 162 


489 


336 


2.2e.-30 


Protein name 

• — : ' 1 








Locus Name ' 


Acc# ' 










sp:RSTA_ECOLI


P52108, 


Description 














TRANSCRIPTIONAL REGULATORY PROTEIN RSTA








ORF Name;. 


NTID 


AAID 


NT 
Length 


AA. 
, Length 


Score 


Probability. 


15625078_cl_19 . • 


1491 


3411 


179 ■ 


537 ■ 


|444 | 


|7..tte-42 



Protein name 

Description .- 
METHYLTRANSFERASE)



Locus. Name 



sp:TRMD_SERMA



Acc# 
P36244 






, 1 

ORF Name . NTID AAID 


NT 

Length,. 


• : ,'(' 
AA 

v — L1 Score 
Length 


Probability 


23468928_cl_18 1 1492 3412 


191 


576 295 


4.8e-26


Protein name 






Locus Name 




, Acc# 








sp:RIMM_HAEIN




• P44568 


Description " . ' . • . 












16S RRNA PROCESSING PROTEIN RIMM


- ORF. Name . NTID . AAID 


NT 
Length 


AA 

T — Score 
Length , 


Probability 


238b9377J:l_4 : | 1493 3413 ; 


502., | 


1509 , 525 | 


2.0e-50


Protein name 






Locus Name 




Acc# 


EnvZ protein " . .: 


gp:YEOMPR


Y08950


Description 






Y. enterocolitica ompR and envZ genes.


ORF Name NTID AAID- . 


NT. 
Length 


AA _ 
_ — , . . Score 
Length : 


Probability 


396I587_c2_2i; .1494 3414 


86 


. 261; 279 


2.4e-24


Protein name . '. . - ' .- 






Locus'. Name 




Acc# 








sp:RS16_HAEIN 


P44382 


Description 












30S RIBOSOMAL PROTEIN S16


. ORF Name NTID . AAID 


NT- 
Length 


AA 

J — . , • Score 
Length. 


Probability 


964692_c3_22 1495. | |3415 . 


598. 


11797 1 442 


1.3e-41


Protein name i. . , , 






Locus Name ■ 




Acc# 








sp:RSTB_ECOLI




P18392


Description . r 












SENSOR PROTEIN RSTB


ORF Name NTID ' AAID 


NT 
Length 


AA 

. — . , . " Score 
Length . t 


Probability 


1057S257_r2_68 1496 3416 


393 


1182 .. 1212 


3.2e-123


Protein name 






Locus Name 




Acc#. 








sp:PUR9_HAEIN 


P43852 



Description 






ORF Name 


NT I D 


AAID 


NT ' 
Length 


AA 

, — . ., , Score 
Length ' - 


Probability 


10736257 t 3 80 


1497 


3417 


6 b 






Protein name , 








Locus Name 


:..ACC# . 


Description. 












NO-HIT


ORF Name 


NTID - 


AAID 


NT 
Length 


. AA n 
, — , Score 
Length 


Probability 


125S6337_tiJJi • 


145.8 


3418 


128 


387 . 477 


2 . 5e : 4b 



Protein name 



Description 



ORF Name 



Locus Name 



sp:PUR9_ECOLI



Acc# ,. 
P15639 



NTID 



AAID 



1272283. 12 60 



1499 



NT . AA 
Length Length 
183 ' l' 1552 



Score 



Probability 
7.5e.20 ; - 



Protein name 
Description ■ 

UBIQUINONE BIOSYNTHESIS PROTEIN AARF



Locus Name 



sp:AARF_ECOLI



Acc# 

P27854:P27855:P76764:P27853



ORF Name 



NTID AAID 



131700 cl 108 



T"5"0u~ 



3420 



NT . . AA 
Length Length 
21T5~ "I 1758 



Score 



?T5~ 



Probability : 
1.4e-16 " " - 



Protein name 



Locus Name 



putative peptidyl-prolyl cis-trans isomerase | gp:ASAJ2316



Acc#, 
AJ002316 



Description 



Acinetobacter sp. ADP1 alkK & alkM genes, ORF1 ORF4.


NT AA 

ORF Name ' NTID AAID ' , — : . , \ — , Score 

Lenylh Length 


Probability 


13876562, cl 128 1501 3421 | |75 . | 228 - | |73 


|P.-0H 



Protein name 



Locus Name 



immunoglobulin kappa light chain variable 



gp:AF131156



Acc# 
AF131156 



Description 



Mus musculus immunoglobulin kappa light chain variable region gene, partial
cds.






ORF Name ■ ' 


NT ID 


AAID 


NT ' 
Length 


AA 

r — L , Score 
Length 


Probability 


13947127_c3_217 


1502 


2422 


584 ... 


1755 1216' 


i.6e-163 ; \ 


Protein name 








Locus Name . 


. Acc# 










sp:SYQ_HAEIN


P43831


Description 












(GLNRS)




ORF Name 


NT ID 


AAID 


NT ■■■ 
Length 


AA 

„' , Score 
~: Length 


Probability 


I4Bb2-03b_clJL29 


. 1503 




85 1 


258 . |70 | 


|0.03> 



Protein name. 



Locus Name 



tat protein 



Description 



HIVU86775 



Acc# 
U86775 



HIV-1 clone ZAM184-5.2 from Zambia, tat protein (tat) gene, partial cds, rev
protein (rev), vpu protein (vpu), and envelope glycoprotein (env) genes,
complete cds and nef protein (nef) pseudogene.



ORF Name 


NTID ' 


AAID 


', NT 
Length 


AA Score 
Length 


Probability 

. - -. ■ j- - 


15663417_12_42 ; 


: 1504' . 


; 3424 ' 


79 




240 




Protein name 










Locus Name ■ 


ACG# 


Description 














NO-HIT


. ORF Name ■ . 


NTID, 


. AAID 


,nt' 

Length . 


AA 

, — /, Score . 


Probability 


16583425_ci_131 


; 1505 


3425 


326 




981 |535 | 


.|l.Be-bl 


Protein name 










Locus Name 


Acc# 


ytjB protein 










pir:B65040 


B65040 


Description 














ORF Name- 


•NTID 


AAID 


NT , 
, Length 


AA 

i — -. , Score ■ 
Length 


..Probability 


|19632665 ' c2 160. 


1505 


3426 


696. , 




2091 633. 


|2.3e-79 


Protein name 










Locus Name 


.. Acc# 



sp:COPA_ENTHR



Description ., 



P32113:Q47841



ORF Name 



NTID . AAID 



T5W 



NT 
Length 
215 



AA 
Length 
F5T — 



Score 



Probability 
14 . 8e-0« 



Protein name 

probable component of cation transport for
cbb3-type oxidase



• Locus. Name 



pir:E71813



Acc# 
E71813 



Description 

ORF Name
[21753552_c3_220 

Protein name 
Description 
NO-HIT



NTID AAID 



1503 



NT 
Length 
1168 ' 



■ AA 
— Score 
Length - — — — 



Locus Name 



Probability 



Acc# 



ORF Name 


NTID 


AAID 


\ '. NT 
:. Length 


AA , * 
T ~, i Score 
Length 


Probability 


2197952 tl^ll. 


1509, 


3429 


122 


359 ,. 




Protein name 








i- . 

Locus Name 


Acc# . 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


' • NT 
Length 


AA "• 

— Score . 
Length 


Probability 


22145253_c2_177 




| P 450 


[210 


r 633 * |592 


1.6e-b7 • • .. | 



Protein name 



Description 



. Locus Name 



sp:OEN_HAEIN 



Acc# 
, P45340 





ORF Name 


NTID 


AAID 


' NT 
Length 


AA 
Length 


' Score 


, Probability - ■ 


22272900J:1_5 


1511 


3431 


227 


6S4 . 


263 


l,2e-22 . 



Protein name 



Locus Name 



- hypothetical protein 



gp:PST243354 



Acc# 
AJ243354 



Description 



Pseudomonas stutzeri nypl and comA genes and putative tolQ, exbB, tolR and
exbD genes.



ORF. Name 



NTID 



AAID 



22285902 c3 212 



NT 
n 



T — ^ Score 
Length Length . — — 



Probability 
1.8e-26 * 



Protein name 



Locus Name 



transposase slr2062: protein slr2062: protein slr2062



pir:S74909



Acc# 

S74909



Description 



ORF Name 



NTID 



AAID 



NT AA 
T — r Score 
Length Length — — 



22710402 c2 154 



T5TT" 



[78 



[2T7~ 



Protein name 

Description 

NO-HIT



Locus Name 



Probability 



ACC# 



ORF Name 



NTID 



AAID 



23457632 13 90 



T"5TT" 



NT . AA 
Length Length 
295 ' 



Score 



FF5~ 



Probability 
l,5e-88 " 



Protein name 



Description 



Locus Name 



sp:UBIE_ECOLI



ACC#.; 

P27851



(EC 2.1.1.-) " . 


ORF Name' 


NTID 


AAID 


NT ., 
Length 


AA 
Length 


■ Score 


" Probability 


23475002_11_9 


. 1515 


■ . 3435 


1 182 1 


549 


, 121' 


, , 4.8e-05 



Protein name 

■ . ■ ■ y ~ 

Description 
COPPER HOMEOSTASIS 



Locus Name 



sp:CUTF_ECOLI 



• t ACC# 
' P40710 



PROTEIN CUTF PRECURSOR (LIPOPROTEIN NLPE)



ORF Name 



NTID 



AAID 



|23Sb4^76 11 15 



. NT AA n 

Length Length 
1384 . | 11155 



Score 



Probability 
4 .le-68 



Protein name \ , 

Description 

UBIQUINONE BIOSYNTHESIS PROTEIN AARF



Locus Name 



sp:AARF_ECOLI



Acc# , 

P27854:P27855:P76764:P27853






ORF Name NT ID AAID 


NT 
Length 


AA 

, — ^ , Score 
Length 


Probability 


23634656_c3_200 1517 3437 


97 


294 ' 147 


2.3e-10


Protein name 






Locus Name 




Acc# 








sp:YEAC_ECOLI 


P76231 


Description 












HYPOTHETICAL 10.3 KD PROTEIN IN ANSA-GAPA INTERGENIC REGION


ORF Name V NTID . AAID , 


NT 
Length 


AA 

„ — , Score 
Length , 


Probability 


p4015550_cl_.147 • | |15i8 |3438 


1 P 1 


|606 | |207 | 


|1.0e 


- ib ' 1 


Protein name 






Locus Name 




Acc# 


hypothetical protein 




gp:AF157493




AF157493 


Description 












Zymomonas mobilis ZM4 fosmid clone 42D7, complete sequence.






ORF Name NTID AAID 


NT 

Length 


AA 

„ •• — . Score 
Length 


Probability 


24259702_tl_10 1519 3439 




1503 275 _ 


2 .4e 


-23 ' . 


Protein name. • , : 






Locus Name




. Acc# 








sp:YF46_ARCFU 


O28726


Description 












HYPOTHETICAL PROTEIN AF1546


ORF Name, NTID . AAID . 


NT 
Length 


AA 

„■ — .., Score 
Length 


Probability 


24303583_tl_30 1520 3440 


93 


282 169 | 


l.le 


12 ' 1 


Protein name 






Locus Name 




Acc# ■ 


small DNA-binding protein Fis




gp:AF040379


AF040379



Description 



Proteus vulgaris ribosomal protein L11 methyltransferase (prmA) gene, partial cds; yhdG homolog gene, complete cds; and small DNA-binding protein Fis (fis) gene, partial cds.


NT' * AA 

ORF Name NTID AAID — ■ — ■ Score 

Length Length


Probability 


24306S10 c3 209 - 1521 3441 224 675 443. 


. 1.0e-41 



Protein name . 

Description .. ; 

CARBOXYLESTERASE 2 (ESTERASE II)



Locus Name 



sp:EST2_PSEFL



Acc# 
Q53547 






ORF Name 



NTID AAID 



24513752 c2 165 



NT AA 
.. Length Length 
241 



Score Probability 
|6.2e-8i ~ 



WTT 



Protein name 



Locus Name 



superoxide dismutase (Mn) : SodA protein



pir:JC6542



Acc# 
JC6542 



Description 
ORF Name 



24614125 -t2 65 



NTID AAID 

] 



NT AA 
— — Score 
Length Length '-■ — 



12451 



Probability 
1.0e-160 



Protein name 



Locus Name 



penicillin-binding protein 1B



Description 



AF147449 



Acc# 
AF147449 



Pseudomonas aeruginosa strain PAO1 penicillin-binding protein 1B (ponB) gene, complete cds.



ORF Name 



NTID AAID 



24640762 13 79 



1524 



3444 



NT 
Length 
388 



. AA 
Length 
11167 



Score , Probability 
11412 



2 . ie-144 



Protein name 



Description 



Locus Name, 



Acc# 



sp:METK_ECOLI



P04384:P30869



ADENOSYLTRANSFERASE) (ADOMET SYNTHETASE)


ORF Name. NTID AAID .— , 

Length 


AA 

, '. — . , Score 
Length • 


Probability 


25551385_rl_17 1525 3445 - 175 


528 298. 


■ 2:3e-26 


Protein name 




Locus Name 


Acc# - 


adenine phosphoribosyltransferase : protein sll1430 : protein sll1430




pir:S75440 


S75440








Description * , 








ORF Name . NTID AAID ^ , . 

•. Length , 


AA 

, — , Score 
Length 


Probability 


25665962_tl_8 1526 3446. [115 .; 


348 • [75. 


[0.0099 


Protein name 




Locus Name 


Acc# 


glutamyl-tRNA (Gln) amidotransferase subunit C


pir:D70484


D70484 









Description 






ORF Name 



NTID AAID 



NT 

n 



AA 

t '■ — ' 1 Score 
Length ■ Length 



Probability 
i.6e-41 " 



Protein name 



Locus Name 



hypothetical protein



pir:S76006



Acc# 
S76006 



Description 



ORF Name. 



NTID AAID 



NT' AA 
Length Wth ^re' 



30650250 11 12 



1528 



3448 



2W 



Probability 
3.8e-55 



Protein name 



Locus Name 



conserved hypothetical protein 



pir:A75256 



Acc# 
A75256 



Description 
. ORF Name 



NTID AAID, 



31541442 c3 183 



1529 



13449 



NT . AA 
Length Length 
317 ..' 



Score 



Probability 
|5.5e-30 



Protein name- 



Locus Name 



putative peptidyl-prolyl cis-trans isomerase



|gp:ASAJ2315 



Acc#
AJ002316 



Description ' 








Acinetobacter sp. ADP1 alkR & alkM genes, ORF1 & ORF4.




ORF Name 


NTID . AAID 


NT - AA 

' —j , Score 
Length Length 


Probability 


32694687_c2_181, 


1530 3450 


140 423 |117 | 


|2.0e-06 ■ 



Protein name 



Description 



Locus Name 



sp:YPBB_BACSU



Acc# 
P50728



HYPOTHETICAL 40.7 KD PROTEIN IN FER-RECQ INTERGENIC REGION






ORF Name 


NT AA 

NTID AAID , ■ - 

Length Length 


Score 


Probability . 


■ i . ■ 

! 


332I3bLb_c3_215 


| 1531 |3451 60 | 183 


I 106 1 


|1.3e-05 


~ 1 * 



Protein name 



Locus Name 



gp:ECU82664



Acc# 
U82664 



Description ' 
Escherichia coli minutes 9 to 11 genomic sequence.






ORF Name. 



NTID AAID 



33243927 cJ 21J 



3452. 



NT AA 
Length - Length 
1229 ■ 



Score Probability 



\G0G | |b.3e-59 



Protein name 



Locus Name 



sp:YCFV_ECOLI 



Acc# 
P75957 



Description ' • • 

HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YCFV



ORF Name , 



NTID AAID 



133394050 13 76- 



NT AA 
Length Length 
269 ' 



Score Probability 
|810 | |340 | |8.2e-31 ~ 



Protein name 



Locus Name 



sp:YBBF_ECOLI



Description . • 
HYPOTHETICAL 25.9 KD PROTEIN IN PURE-PPIB INTERGENIC REGION 



Acc# • 

P43341:P77440



ORF Name 


• NTID 


AAID 


NT 
Length 


AA "V ' '■ 
. — Score 
Length 


Probability 


34S5_13_89 


|1534 


3454 


71 


216, 




Protein name 








, Locus Name 


Acc# 


Description , 












NO-HIT 












ORF Name


. NTID 


AAID . ' 


NT, . 
Length 


aa : 

— • Score • 
Length.,. 


Probability 


39065S1_11_1S ■• 


■ 1535 . 


3455 


|24S ■ 


747 - 311 | 


9.7e-28 


Protein name '. ' 








Locus Name 


• Acc# :. 



Description 



gp:STMBLDA



M80628 



Streptomyces griseus transfer RNA-Leu (bldA) gene and ORF, complete cds.


. ORF ■ Name 


NTID , AAID 


NT- 
Length 


AA 

• Score , Probability . 

Length - ■ .; J 


|3910693_12_39 


1536- 3456 - 

. ■ ' • r 


j 172 : 


|519 | 526,. | 1.6e-b0 


Protein name 






Locus Name Acc# 








sp:CYPB_ECOLI 
P23869:P78052


Description


(ROTAMASE B)






ORF Name 



NTID 



AAID ■ 



3944178 ±2 52 



T5TT 



NT : AA 
Length Length 
328 " 



Score Probability 



Protein name 
Description 
[NO-HIT . 



Locus Name



Acc# 



ORF Name 



NTID AAID 



3'953218 cl 125 



153S 



3458 



NT AA 
Length Length 
^T3~" 1 12832 



Score 



Probability 
l.le-10 



Protein name 



Locus Name 



PhoC protein 



gp:KPN250377 



Acc# . 
AJ250377 



Description 



Klebsiella pneumoniae partial selD gene for SelD protein and phoC gene for PhoC protein.



ORF Name .' , 



NTID AAID 



3991527 12 67 



1539 



3459 



NT AA 
Length Length 
TTT2" 1 16429 



Score 



Probability 
4.0e-51 — 



Protein name 



Locus Name 



gp:U41852



Acc# 
U41852



Description 

Haemophilus influenzae hsf gene, complete cds.



ORF Name. " 


NTID 


AAID 


NT 

Length 


AA 

, ' — . , ' Score 
Length • ,. - 


Probability 


43227 93_12_5.7 O 


1540 , 




3460 -V - 


217. 




6 54 ' 




Protein, name 












Locus Name . 


Acc# 


Description . 
















NO-HIT . ' 


ORF Name 


1 NTID 


AAID 


NT 
Length 


AA ■ ■ 
T — . Score 
Length 


Probability 


4410943 c3 219- 


1541 




3461 


91 . 




276 |103 | 


|1 .ie-05 • 



Protein name 



Locus Name. 



sp:YGEY_ECOLI



ACC# . . 
Q46825 



Description ■ .''...«• " 

HYPOTHETICAL 10.5 KD PROTEIN IN FLDB-BGLA INTERGENIC REGION



ORF Name 



NT ID 



NT 



AA 



AAID — . u Score Probability 
— ■ — ~ Length Length — ■ v— -• .. — r— r-^ 

1452 ' 



7.9e-07 



Protein name 



Locus : Name 



metal transporter Nramp4 



gp:AF202540



Acc# 
AF202540 



Description 

Arabidopsis thaliana metal transporter Nramp4 mRNA, complete cds.



ORF Name 



NTID 



AAID 



4782812 cl 141 



NT . AA 
Length , Length 
147 



Score Probability 



444 I |95 I |0.,011 



Protein name 



Locus Name 



hypothetical protein TM1026 



pir:A72203v 



Description 



ORF Name 



Acc# 
A72303 



NTID 



AAID 



NT 



AA. 



4798430 cl 151 



; — - — ' . Score Prob ability. 

Length Length - — ; — — — ^ . 



J1544 




3464 




453 . 




1352 


1 



|7.2e-62 



Protein name 

Description ' 
S.cerevisiae chromosome XIII cosmid 9745: 



Locus Name 



gp:SC9745



Acc# 

Z38114:Z71257



ORF Name 



NTID. 



AAID. 



■NT- 



AA 



Length Length 



Score , , 1 Probability 



5125318__c3_206 11545 .3465 


356: 


1071 184 5.0e 


-21 


Protein name 




Locus Name 


Acc# 


» 




gp:ATAC007168


AC007168


Description ; 








Arabidopsis thaliana chromosome XI BAC T26C19 genomic sequence, complete sequence.


i ■ 






ORF Name NTID' / AAID ■ -, 


NT .' . ' 
Length 


AA 

— r< Score • Probability 
Length : — -■ — . :■ 


5192757_cl_144 .1546 3466, 




|1278 734 |1.5e 


-72 . 


Protein name 




Locus Name 


Acc# • r 






sp:YCFW_ECOLI


P75958


Description 












ORF Name 



NT ID 



AAID 



5343752 13 101 



3467 



NT 
n 

ITS" 



AA ■" 

T — Score 
Length Length — 

[6T0~~ 



Probability 
|I.5e-61 " 



Protein name .. . 

Description 
RIBOSOMAL PROTEIN L11 METHYLTRANSFERASE



Locus Name 



sp:PRMA_ECOLI



Acc# -- 

P28637:P76680:P76681



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

% — , Score 
Length 


Probability . 


7042580_ci:142 


1548 




3468 


75 






Protein name 










Locus Name 


Acc# 


Description 














NO -HIT ' ' 


ORF, Name' 


■ NTID 


AAID ' 


. NT 
Length 


AA 

, Score 
Length 


Probability 


7312717_i3_.77 


1549 




3469 


7b 


.228 |77 | 


|0. 028 



Protein name ' ; « 



Locus Name 



conserved hypothetical protein 262, 



pir:S59078 



Acc#
S59078 



Description 



ORF 'Name 



NTID 



AAID 



822680 c3 218 



T53TT 



NT AA -*•, 
Length Length 
WIT — 71 11254 



Score 



Probability 
8.8e-82 — 



Protein name 



■Locus Name 



glyceraldehyde-3-phosphate dehydrogenase



gp:BACPGKTTMG



. Acc# : 
M87647



Description 



Bacillus megaterium glyceraldehyde-3-phosphate dehydrogenase (gap), phosphoglycerate kinase (pgk), and triose phosphate isomerase (tpi) genes, complete cds.






ORF Name NTID AAID- 


NT AA 
t — . i V Score 
Length Length 


Probability 


97<S7325__t3_82 . | 1551 3471 


147 | 444 |476 | 


|3.2e-4b j 


Protein name 


Locus Name >. 


Acc# 


transposase homolog A 


gp:HPU95957


U95957



Description 



Helicobacter pylori insertion sequence IS606 transposase homologs A (tnpA) and B (tnpB) genes, complete cds.



ORF Name 



NT ID 



AAID , 



12635413 ±3 " b 



3472 



NT , ■ AA 
Length Length 
813 • ' I 12442 



Score Probability 
[1030 | |4.2e-129 " — 



Protein name 



' Locus • Name 



sp:UP05_ECOLI



Description , 

UNKNOWN PROTEIN FROM 2D-PAGE SPOTS M62/M63/O3/O9/T35 PRECURSOR



Acc# 

P39170:P39181:P77465



ORF Name 



NTID 



AAID 



31671880 ti 2 



NT AA 
Length Length 
185 < I 



Score 



Probability 
|6.2e-33 



Protein name 



Locus Name'- 



FabZ 



gp:NMU79481



Acc# 
U79481 



Description 



Neisseria meningitidis UDP-3-O-(R-3-hydroxymyristoyl)-glucosamine N-acyltransferase (lpxD) gene, partial cds, and 3(R)-hydroxymyristoyl acyl carrier protein dehydrase (fabZ) and UDP-N-acetylglucosamine acyltransferase (lpxA) genes, complete cds.



. ORF Name 


NTID 


AAID 


NT 

v ■ Length 


AA 

— , ■ r Score 
. Length 


.Probability 


36148427 ±3' 8 


, 1564 


. 3474 


67 


201 - 




Protein name 








Locus Name 


Acc# 


Description. 












NO -HIT ; • : , • • 


' ORF Name 


NTID 


' AAID 


NT 
. Length 


AA , -■■ 
— , Score 
Length 


Probability 


44i2963_±2_4 


|i-55B, ' 




1 1 


558 ; |470 | 


■ |i.4e-44 


Protein name 








Locus Name - 


Acc# 



Description 



sp:LPXA_ECOLI



P10440:P78243



(EC 2.3.1.129) (UDP-N-ACETYLGLUCOSAMINE ACYLTRANSFERASE)



' NT AA 

ORF Name NT ID AAID T Score Probabil: 
' — — —■ Length Length — — 



4687640_t2_3 




1556 


3476 


340 


1023 |667 | 


|i.8e 


-65 . 


Protein name 












Locus Name 




.*' Acc#' 














sp:LPXD_HAEIN 


P43888 


Description 


















(EC 2.3.1.-) 




ORF Name 




, ntid ; 


AAID 


NT 
Length 


aa' 

— Score 
Length . 


Probability . 




iiy / oi^ / 11 z 




1557 . 


3477 


379 


1140 811 | 


l.Oe 


-80 


Protein name 












Locus Name 




Acc#. , 






.- 








sp:YECP_ECOLI 




Description 
















P76291:O07983


HYPOTHETICAL 37.0 KD PROTEIN IN ASPS-BISZ INTERGENIC REGION








ORF Name




NTID 


AAID 


NT 
Length ' 


AA - n 
. — . v Score 
Length 


Probability. 


14658562_t3_6 




| 1S58 


IP 478 1 


I 505 1 


, 93 0 J835 | 


|2 . ?e 


-83 ' . 


Protein name 












Locus Name 




Acc# 














sp:YEDI_ECOLI




Description 
















P46125:P76332


HYPOTHETICAL 32.2 KD PROTEIN IN DSRB-VSR INTERGENIC REGION








ORF Name " ■' ' 




NTID 


AAID 


, NT 
Length 


AA : ■' ', : 
;■ — ... Score • 
Length 


Probability , .' 


23714375_±3_8 




ib59 


3479 


100 


, 303. . |70 r | 


|0.033 


Protein name 












Locus Name 




Acc# 


outer membrane protein H.8 precursor






pir:S04157


S04157 


Description 


















ORF Name 




NTID> 


AAID 


NT 

Length, 


AA 

Length • - 


Probability 


24253427_r3_7 




1560 . 


3480 


85 


258 • 






Protein name 












Locus Name 




Acc# 


Description 





















NO-HIT 






ORF Name 



NT ID • 



AAID 



24322153 £3 b 



■ NT 
Length 



AA 
Length 
1774 



Score Probability 

4 . le-4b " : " 



47b 



Protein name. 



Description 



Locus -Name 



sp:YECO_HAEIN 



Acc#, 

P43985:P4^ 
986 



HYPOTHETICAL PROTEIN HI0319/32<!!) . 


ORF Name . NTID AAID 


■ NT 
• Length 


AA 
Length. 


Score 


Probability 


24804651_c2jl:S 1552 ■ 3482 


62 


189 




171 


3vle-12 



Protein name 

Description 
SPOROZOITE SURFACE PROTEIN 2 PRECURSOR



Locus Name 



sp:SSP2_PLAYO



Acc# 
Q01443 



ORF Name 


NTID 


AAID. 


NT 
Length 


AA 

j ~ . 1 Score 
Length 


Probability 


35181955_ci_ll 


1553 


3483 


251 | 


l 75S 1 : 






Protein name 










Locus Name .'; 




, Acc# 


Description 
















NO-HIT


. ORF Name 


NTID 


AAID 


NT 
Length 


AA r 0 
.— ; Score 
Length 


, Probability .' 


3955437_c2_19 


1554 ; 


3434 


73 


219 . 138 


■ 2 . le 


-09* 


Protein name 

— '. , ' — ■ — ! 










Locus Name 




■ Acc# 


peptide methionine sulfoxide reductase




pir:E75345 


E75345


Description 
















ORF Name . 


NTID : 


aaid' 


NT 

. Length . 


AA . ' ' 

■ — ■ Score 
Length 


Probability 


5117337.13 9 


| 


3485 


375, 


1140 1148 


2.0e 


-116 


Protein name 










Locus Name 




■ Acc#V ' ■ 


serine-pyruvate aminotransferase






pir:F75259, 


F75269 



Description 






ORF Name 



NT ID ■ AAID 



1053441 c3 193 



T^6" 



Protein name 



hypothetical protein 25 



Description 



NT AA ■ . , ■ ■. ' 
' — . _ ' • — L1 Score Prob abil ity 
Length Length ------ - • . — : — 



TTZT 



TTF 



Locus Name 



pir:T13514



l;4e-2I 



Acc# 
T13514 



ORF Name 



NT ID . AAID 



10650681 c3 219 



1567 



IT5T" 



NT AA' 
Length Length 
102 . 



JUT 



Score Probability 
- [72 1 |1..0e-05 — — 



Protein name 



Locus. Name 



unknown 



gp:AF050676



Acc#
AF050676



Description , 



Pseudomonas aeruginosa lipoprotein (op rX) and ferric uptake regulator (fur) genes, complete cds; and unknown genes.



ORF Name 


NTID 


•'AAID 


NT 
. Length 


AA 

_ — . , Score 
Length . 


Probability 


119027_c2_166 


1569 


'3499 


95 




288 ' . V, 




Protein name 










Locus Name 


. Acc# 


Description 














NO-HIT . y •' . . , 


ORF Name " ' 


NTID 


r : 

AAID 


NT ■ 
Length 


AA 

T — . , Score 
Length 


Probability 


1227302_c3_221 : 


1565 


3489 


90 


|273 85. | 


0.012 - 


Protein name 










Locus Name 


Acc#'. ■■ 


probable fatty-acid--CoA ligase, fadD7




pir:C69471


C69471 


Description 






.■ ■ 








ORF Name ■ .'' 


\ NTID 


AAID 


NT ... 
Length 


AA . 

■ , Score 
Length 


Probability 


12773910_c2_17i 


1570 


3490 


116 




351 - 





Protein name 
Description 
NO-HIT ■ ~~ 



Locus Name 



Acc# v 






ORF Name 



NTID 



AAID 



NT 
Length 
~ZT ™ 



AA rt 

T ""^Yu Score 
Length " 



Protein name 
Description 
NO-HIT



Locus Name 



Probability . 



Acc# 



|N0- 



ORF Name 
[12993i25_c2 170 

Protein name 
Description 
[NO-HIT r— 



NTID 



■AAID 



TTTT 



NT 
Length 
T5f— 



■ AA 
Length ' 
1459 



Score Probability 



Locus Name 



Acc# 



ORF Name 


NTID AAID 


NT 
Length 


AA " 

— . , Score 
Length 


Probability 


130Bbl60_c3_196 1 


| 1573 | J3493 | 




I 711 1 • 




Protein name " 






Locus Name 


Acc# 


Description 










NO -HIT , 


ORF Name 


. NTID AAID' ;v 


NT 
Length 


AA- - ■ 
— , Score 
Length 


Probability ; : 


ir/1003J>3_87 s 


| 1574 3494 : | 


413; 


1242 [1088 | 


|4,5e-li0. 


Protein name. 






Locus Name 


. ■ ■ Acc# 


Na+/H+-exchanging protein : Na+/H+ antiporter


pir:JX0360


JX0360

"... 'i 


Description 










ORF Name 


NTID . AAID 


NT 
Length 


AA ; 
, — , , Score 
Length. . 


Probability 


14S47033_c3_209 


| 1575, 349b 


192 


579 128 . 


1.8e-07 ■ 


Protein name 






Locus Name 


' Acc# 


muramoyl-pentapeptide carboxypeptidase 


pir:T34747 


T34747


Description 






' ■ i ' . ' 




ORF .Name 


' NTID . AAID 


NT AA . 
— -■ — Score 
Length . Length 


Probability 


i4745253_c3_214 ■ 


| 1575 |3496 | 


468; | 


|14 07 | |5 96 


S.ie-58 , - 


Protein name 






Locus Name 


Acc# 


hypothetical protein RV3734C




pir:G70797 


G70797 



Description 



ORF Name 


NT I D 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


15633253 Cl 149 


1577' 


3497 


150 


. 453 






Protein name 








Locus 


Name 


Acc# .; 


Description 














NO-HIT • , 


ORF' Name 


NTID 


. AAID 


NT . 
Length 


AA 
Length 


Score 


Probability 


i6485906_c3JL92 


1578 


• 3498- 


484- ; 


1455 


p7 | 


I2.3e-14 



protein name 



Description 



Locus Name 



sp: VGI7^BPMD2 



ACC# 
064210 



MAJOR 


PROTEIN 


GP17 , 








ii . - 


ORF Name 




NTID 


AAID 


NT 

Length 


AA . 

t* 5 :^, Score 
Length 


Probability 


16595716 t2_ 


69. 


1579 


,3499 


60 


183 ' 




Protein name 










Locus' Name 


. '. Acc# 


Description 


■v 












NO -HIT • " . ;. 


ORF Name-. 




NTID 


AAID 


•NT 
•Length 


AA 

■\ , '.< .Score 
••Length . - •• • 


Probability 


16597827_cl_ 


m 


1580; 


. 3500 


205 


618 |92 | 





Protein name 



Locus Name 



putative prohead protease 



gp:AF181080 



■ Description 



Acc# 
AF181080 



Rhodobacter capsulatus putative large terminase, putative portal protein, and putative prohead protease genes, complete cds; and putative capsid protein gene, partial cds.



ORF Name 



NT ID 



AAID 



19547875 cl 157 



1551 



3501" 



NT AA 
Length Length 
124 



1375" 



Score ^ Probability 



U.2e-15 



Protein name 



Locus Name 



mono-heme c-type cytochrome ScyA



gp:AF044582 



Acc# 
AF044582 



Description 



Shewanella putrefaciens NrrG homolog gene, partial cds; and mono-heme c-type cytochrome ScyA (scyA), cytochrome c maturation protein A (ccmA), cytochrome c maturation protein B (ccmB), cytochrome c maturation protein C (ccmC), cytochrome c maturation protein D (ccmD), and cytochrome c maturation protein E (ccmE) genes, complete cds.



ORF Name 



NT ID 



AAID 



19697265 c2 179, 



TITBIT 



T5TJT 



NT 
Length 
— ~ 



; AA 

Length 
\ 1195 



Score Probability 
175 I' 10.020 



Protein name 
Description ■ 



Locus Name 



sp:YC67_ASTLO



ACC# 
P34778 



HYPOTHETICAL 20.1 KD PROTEIN YCF67 (ORF170)



ORF Name . 


NTID 


AAID. 


' NT 
. Length • 


AA . 

. ; Score • 
Length 


Probability 


20917082 t3_105 


• 1593.. 


3503' 


71 1 




216 , 


• k ■ • 


Protein name 










Locus Name 


Acc# 


Description . . 














NO-HIT 


ORF Name 


' NTID 


AAID 


' 'NT 
Length 


AA 

■ \ . Score 

Length 


Probability 


2166341Q_tl_i7 




- 3504 


|1V0 








Protein name 










. Locus Name 


. Acc# 


Description 














NO-HIT 


1 












ORF Name . , 


NTID 


AAID'. 


NT ■ 
Length 


AA 

^ — ■ , Score 
Length 


Probability 


22351557_c3JL91 


1585 


3505 


89 




270 [70 | 


[0.0039 , 



Protein name 



Locus Name 



hypothetical protein F26B6.23 



pir:T01147



Acc# 
T01147 



Description 






ORF Name 



NT ID. AAID 



22381542 • c3 199 



T5751T 



NT AA 
Length Length 
1257- ;| 



Score 



77T" 



Probability 
3.3e-43 - — 



Protein name 



Locus Name 



minor tail protein L homolog : protein gp18



pir:T13104 



Acc# 
T13104 



Description 
ORF Name 



NTID AAID 



123437551 12 52 



1587 



3507 



NT AA . 
Length Length 
H 12072 



Score 



11985 



Probability 
|4.0e-205 



Protein name 



Description 



Locus Name 



sp:SYM_HAEIN



J Acc# 
P43828 



(METRS)


ORF Name. 


NTID 


. AAID . 


NT 

Length , 


AA ' 
• : — :. , Score 
Length 


Probability - 


23549217_cl_144 , 


1588 


| 3508 


192 


|579 | |113 


b ;4e- 


-Ob . ■ - . 


Protein name 








Locus Name 




Acc# / 


hypothetical protein






pir:T14651


T14651 


Description 














ORF Name . 


NTID 


AAID 


NT 
. Length 


AA 

T — , , Score 
Length 


Probability . 


p^47257_c2 1^7 • 


: | |i5tt9... 


| 3509 , 


| |Vib 


378 | 






Protein name . 








Locus Name 




Acc# 



Description 



NO-HIT. • . : . - ' 


ORF Name 


. .NTID ■ 


AAID ■ 


NT 
Length 


AA 
Length 


Score 


Probability 


24316886_£3_115 


| [1590 


3510 


ji51 


|45S | 


' 356 


1.7e-32 .. 



Protein name



Locus Name 



sp:Yt)CQ_ELmJ 



Acc# 
P7610T 



Description . : . 

HYPOTHETICAL 16.1 KD PROTEIN IN TEHB-ANSP INTERGENIC REGION






ORF Name 


NT ID 


AAID 


NT 
Length 


AA 

T — , , Score 
Length 


Probability ... 


244i5875_t2_4y 


. 1591 . 


3511 


154 


465 ■■ 




Protein name : 








Locus Name 


Acc#


Description . 












NO-HIT • - ■ . . \ - 


ORF Name 


NTID 


AAID ; 


NT 
Length 


AA 

„ — : , Score 
Length 


Probability 


24417540_ c2_187 


1592 


3512: . 


:>09 . 


530 |600 | 


|2.3e-58 . | 


Protein name 








Locus Name 


Acc# 



Description 



gpiXCRPt'U. 



Y09700 



X. campestris rpfB gene.










ORF Name 


NTID ... 


AAID 


NT 

Length 


AA i 
T — . Score. 
Length 


Probability 


2443±265_c2_l$2 


J1593 


3513 


_ 485 


' 1451 | |1251| 


|2.1e-128 


Protein name '' 








Locus Name , 


Acc# 










sp:SYC_ECOLI


P21888 


Description 












(CYSRS)


ORF Name 


: NTID .. 


AAID 


' NT 
Length 


■ AA 

• • — , . Score 
Length . ■ . 


Probability 


24614,431_c2_17i 


' 1594 


3514 


159- 


510 




Protein name 


■ >i - ■ 






Locus Name 


Acc# . 


Description 












NO-HIT . 






j. 






ORF Name ■ • 


.. N.TID . 


' AAID 


NT 
Length 


AA ' " ■ 
— - ■ Score 
Length • . 


. Probability 


24531552_c2_181 


1595- 


, 3.515- 


2 71 ■ 


|815 | 723 


' 2.1e-71 



Protein name 



Locus Name 



thiamin biosynthesis protein thiG
Description 



pir :B70487 



' Acc# ■ 
B70487 






ORF Name , » ■ 


NT ID 


AAID 


NT 
Length 


AA 

.„ — ^ Score - 
Length 


Probability 


24882676_c2_16:4 ■ 


; 1596 


. 35,16 . ' 


. 198 * • . 


597. 261 


1.9e^>2 . | 


Protein name 








Locus Name 


. Acc# 










sp:YE18_HAEIN 


P44189


Description 












HYPOTHETICAL PROTEIN HI1418




ORF Name 


NT ID 


AAID' 


NT 


— Score 

T .or~i rr t~ h — — 


Probability 


p5397700_cl_I4U 




3517 


| 221 | 


|666 . |3B8 | 


|6.7e-36 | 


Protein' name 








Locus Name 


•Acc# 


minor tail protein gp20






pir:T13106


T13106 


Description , 












ORF Name 


NTID 


AAID 


NT 

Length 


AA 

— Score ... 
Length . 


Probability 


25493762_cl_161 ' 


1598 


3518, 




183 




Protein, name 








Locus Name v 


Acc# 


Description . 












no-hit ; ./ . ; .-• • . ' • • , 




ORF. Name 


NTID 


AAID . 


NT 
Length 


AA 

— Score 
Length - 


Probability ; " 


25584<527_c3._2i6 


|1599 ■ - 


■ 3515 




183 




Protein name 








Locus Name " 


Acc# -'■ 


Description 












NO-HIT • -;. ; ' .. ' 1 , 




ORF Name 


• NTID 


AAID r 


NT 
Length 


AA 

, — . , Score 


Probability 








■Length - . 




258i542^-c3_2iJ 


|1600 


3520 


72 


|219 | 





Protein name " Locus Name ' * Acc# 

Description '., 

NO-HIT ' " -, — : ; " ' ~ *' • • ' " '■ " ~ - 



ORF Name' 



126819002 cl 141 



NTID AAID 
11601 



Protein .name 



hypothetical protein yorB



Description 



NT AA . 

T ~Vu T ~*_ : u Score 
Length Length = — 



T7T~ 



YTT 



Locus Name 



pir :T12887 



Probability 

10.020 — 



Acc# 

T12887:C69922



ORF Name , 


' NTID 


AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


276927_t2_S8 i 


,| 1602 . 


.3522.* 


| 330 


' 593 111 


0.0016 | 


Protein name 








Locus Name 


Acc# 










sp:PINQ_ECOLI


P1880? 


Description. 












PINQ PROTEIN


ORF Name 


«■ NTID 


AAID 


NT 
Length 


AA ■ 

' — . Score 
Length ■■ 


Probability 


2792176_c2_180. 


1603 


IF" 


112 


339 | |112 


i.2e-06 | 


Protein name 








Locus Name 


Acc# 



sp : YRK1MJACSU 



P54433 



Description . '. t . ■ ': . 

HYPOTHETICAL 20.7 KD PROTEIN IN BLTR-SPOIIIC INTERGENIC REGION



ORF Name 


■ NTID 


AAID ■ 


NT 
Length 


AA 

T — , i Score 
Length 


Probability 


29337908_li_37 


■ ■ |16 04 


* 3524 ' 


52 


249 | 


\ . 


Protein name 






- " ') 


Locus Name 


Acc# ; 


Description ■ 












NO -HIT 


... ORF - Name . 


NTID 


AAID, 


NT 
Length 


" AA f 
_ — Score 
Length 


Probability 


31678827_c3' 222 


|1605 


| psas 


. 244 


|735 |\ |527 - 


1.3e-50 



Protein name 



Locus Name 



long-chain-fatty-acid-CoA ligase



gp:AF150669



Acc# 
AF150669 



Description 



Pseudomonas putida long-chain-fatty-acid-CoA ligase (fadD) gene, complete cds.






ORF Name 



NTID AAID 



FT TUT. 



NT 
Length 
\£T~ — 



AA 
Length 
1185 



Score Probability 
[54 I |0.006b ■ 



Protein name 

Description 
HYPOTHETICAL PROTEIN MJ0683



Locus Name 



sp:Y683_METJA



'. Acc# • 
Q58096 



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

- — , Score 
Length 


Probability 


^2077!>IJ:3_10i. 


■■ ( J1607 


|3 527 




381 | 




Protein name 










Locus Name 


,. ACC# 


Description 














NO-HIT . 














ORF Name 


NTID 


AAID 


NT 

• Length 


. AA 

V — Score 
Length 


Probability 


35i87543_c3_2i7 






578 . 


1137 .| 105 


0.005.8 


Protein name 










Locus Name « 


Acc# " 


AdcB protein










gp:SPADCA 


Z71552 


Description 


Streptococcus pneumoniae adcRCBA operon.








" ORF Name 


■ NTID 


AAID 


NT 

j; Length. 


AA 

, — Score 
Length 


Probability/' 


3937813_cl_ib8 


1509. 


3529 


233 


\m \ pss |, 


|8.9e-34 • 


Protein name ; 










Locus Name 


Acc# 












sp:CYC4_PSEST


Q52369 


Description 














CYTOCHROME C4 PRECURSOR












ORF Name 


NTID ; 


AAID 


NT 
Length 


AA • 
, ■— . , Score 
Length 


Probability 


|402217_c3_194 . 


| 1610 




I 161 1 


485 




Protein name 










Locus Name ./ 


Acc# 



Description 
NO-HIT






ORF Name 


NTID . AAID . 


NT 
Length 


. AA 

r — ^ Score 
Length ■ 


Probability 


4069212_c3_19b- 


1611 3531 . 


118 


357 |83 | 


|0. 017 


Protein name 








Locus Name 




Acc# 










sp:Y182_METJA




Q57641 


Description 














HYPOTHETICAL PROTEIN MJ0182


ORF Name 


NTID AAID 


: NT . 
Length 


AA 

— , Score 


Probability 






Length 






4331b63_c2_172 , | 


|U1B | |3532 


11179 , 


|3540 | |181 


|1.2e 


-09 


Protein name 








Locus Name 




Acc# 


unknown 








gp:AF011378




. AF011378 


Description 














Bacteriophage sk1 complete genome.


ORF Name 


NTID '. AAID ' 


■ ' NT 
Length 


AA 
Length 


Probability 


4415938_c2_177 


|161J 3^33 


1627 


4884 |1863 | 


|3 . 4e 


-198 


.... . .. ^ . • • 

Protein name 








Locus Name 




' Acc# . 


tail tip fiber protein gp21






pir:T13107


T13107 


Description r 














•ORF Name 


NTID AAID 


" NT 
Length 


AA 

„ '■■ — Score • 
Length ■ . 


Probability 


4861263_c2_169 


|1614 3534 


. 121 


366 






Protein name 








Locus Name 




Acc# . 


Description * ; 














NO-HIT " ■ • ; ' : 


ORF Name 


NTID : AAID 


"' NT 
Length 


AA ' ■ 
- — , Score 
Length 


Probability . 


4.8678I9_c2_162 


|1615 3535- 


196 


591 . 404 


1.4e 


-37 


Protein name 








Locus Name 




Acc# 


hypothetical: protein HP13 34 7 




• pir :F64686 




F64686 



Description 






ORF Name 



NTID 



AAID 



513007b c2 186' 



NT AA 
Length Length 
531 "I 11296 



Score 



Protein name, 



Description 



Locus Name 



Probability 
Jl.le-85 

'Acc# 



sp:DFP_HAEIN 



P44953 



DNA/PANTOTHENATE METABOLISM FLAVOPROTEIN HOMOLOG



ORF Name 



NTID 



AAID 



555437 tl 28 



3537 



NT AA 
Length Length 

m — 1 



Score Probability 



Protein name 
Description 
NO-HIT '~~ 



Locus Name 



Acc# 



ORF Name 



5375032 -c2 175 



.. /' NTID 

E 



AAID 



3"5TS~ 



NT . AA 
Length Length 

im — 



Score Probability 



fluT" 



] [ 



i.le-33 



Protein name 



Locus Name 



minor tail protein gp19



pir:T13105



Acc# 
T13105 



Description 



ORF Name 



NTID ■ AAID 



682777 cl, 145 



T5HT 



' NT AA 
Length - Length 
TT5 — 



Score Probability 



Protein name 
Description 
[NO-HIT 



Locus Name 



Acc# 



ORF Name ; 



1683187 cl 135 



NTID AAID 
1 [1620 



— . , ■ _ " — . " Score Probability 



NT . AA 

Length Length 

I7T 



Protein name 
Description 
INO-HIT ■ 



Locus Name 



Acc# 






ORF Name 



692S4b2 13 100 



NT ID AAID 
1 [1521 . 



T5^T" 



NT AA 
Length Length 
~H 1207 



Score Probability 
|69 | [0.042 — 



Protein name 



Locus Name 



hypothetical protein APE0740. 



pir:E72664



Acc# 
E72664 



Description 
ORF Name 



NT ID 



AAID 



790807/11 16 



NT AA 
Length Length 
TUT 



Score Probability 



] [ 



Protein name 
Description 
[NO -HIT " — • 



Locus Name 



Acc# 



ORF Name . ; 
63030.0 11 21 



NT ID AAID ■ 



NT AA 
— - — ~~ Score 
Length Length — : 



Protein name 



Description 



Locus Name 



Probability 



Acc# 



NO -HIT : . 1 


ORF Name 


NT ID ' • 


AAID ,; 


■ NT 
Length' 


AA 

,„ — . , Score 
Length 


Probability 


8S5782_c3_198< 




3544 ■ 


750/. 


2253 |173 | 


|5.3e-12 


Protein name 








Locus Name '•- 


• Acc# , 



Description 



gp:AB030825



AB030825



Pseudomonas aeruginosa genomic DNA, partial sequence, strain PAO1.


ORF Name t NTID AAID 


NT 
Length 


' . AA ' , , 
Length' 


Score 


Probability 


14175056_11_2 . 1525 - J3545 


67 | 


,p 0 4 | 


|116 


4.5e.-07 



Protein name 

Description 
A. brasilense carR gene.



Locus Name 



gp:ABCARRA 



Acc# 
X70360



ORF Name. 



NT ID 



AAID 



125851527 c3 33 



11626 



NT 
n 



AA ;' 
t "~ Score 
Length Length 



12025 



Probability 
1.7e-74 



Protein name 



Locus Name 



protein-disulfide reductase



gp:AF010322 



Acc# 
AF010322 



Description 



Pseudomonas aeruginosa protein-disulfide reductase (dipZ) and catabolic
dehydroquinase (aroQ) genes, complete cds.



ORF Name 


: NTID AAID , 


NT. 
Length 


' AA ' 
Length 


Score 


Probability 


26276%-l_c2_28 


1627 : 3b47 


405 


1218 


|1607 | 


|4.5e-165. 



Protein name 



Locus Name 



chloroacetaldehyde dehydrogenase



gp:AF029733



Acc#
AF029733



Description 



Xanthobacter autotrophicus linear plasmid pXAU1
chloroacetaldehyde dehydrogenase (aldA) gene, complete cds.



ORF Name 



NTID 



AAID 



133581289 c2 24 



NT ; , 'AA 
Length ' Length 
512' J I 11539 : 



Score . - Probability 



TTFT 



|4.1e-123 



Protein name • 



Locus Name 



sp:Y736_HAEIN 



, Acc# 
P44849 



Description -, . '■- :'■ <•/, ■ 

HYPOTHETICAL SODIUM-DEPENDENT TRANSPORTER HI0736



ORF ; Name 



NTID 



AAID 



Score Probability 



5312692 t3 15 



" Protein name 



NT AA 
Length Length 
411 | [1236 | [1075 | pie-108 

Locus. Name ' Acc# 



sodium/proton-dependent alanine carrier protein
homolog yrbD



pir:C69972



C69972 



Description 
ORF Name 



NTID . AAID 



NT. 



6152307 c2 26. 



T53TT 



Length - Length 
387 I .11164 



AA 

- — . „ Score 



11087 



Probability 
15. Ve-110 



Protein' name , 

, Description 

BD-I OXIDASE SUBUNIT II)



Locus Name 



sp:CYDB_ECOLI



Acc# 
P11027 






ORF Name 



, NTID AAID 



7814.61- c2 2b 



11631 



NT-. 
Length 
,480 • 



AA- 
Length 
11443 



Score 



Probability 
2.1e-16.0 ■ 



Protein name 

Description ' . - - ' 

CYTOCHROME D UBIQUINOL OXIDASE SUBUNIT I



Locus Name 



sp:CYDA_AZOVI



Acc# 
Q0904? 



ORF Name 



NTID " : AAID 



NT 
Length 



125143 cl 36 



11632 



AA 
Length. 
1249 ; 



Score 



TTT 



Probability 
5.7e-09 



Protein name 



Locus Name 



probable enoyl-CoA hydratase



pir:G75557 



Acc# 
G75557 



Description 



ORF Name 



. NTID AAID 



12632255 c3 48 



TUJT 



: NT 

Length 
TT9 ~ 



AA 
Length 
1720 



Score Probability 
fTT6~ 



8 . 4e-06 



Protein name 



Locus Name.,:.. 



probable erythrocyte-binding protein MAEBL



pir:T09127 



Acc# 
T09127



Description 



ORF. Name NTID 


NT 

AAID - . • . ■ — . _ 
Length 


AA 
Length 


Score 


. Probability ^ 


13064425_tl_6 |1634 


; 3554 160 


14.83 


569 


4.4e-bb Vv- 


Protein name 




Locus Name 


Accfr' ' ' ■ 






sp:HEM6_ECOLI


P36553 


Description : 










(COPROPORPHYRINOGENASE) 


(COPROGEN OXIDASE) 








ORF Name NTID 


NT 

AAID " =■ — : ■■' 

Length 


AA 
Length 


Score 


Probability 


I6692186_±2_20 - ' 1635 


. 3555- | 158 | 


| 477 


166 


2.3e-12 - 



Protein name 

Description 
CYTOCHROME C 



Locus Name 



sp:CYCP_ALCSP 



Acc# 
P00138 






ORF Name . NT ID AAID 


NT 

Length. 


•.AA 

T ~ -J,' Score 
Length 


Probability 


195277_il_ii 1636 3556 


40S 




1218 491 


8.2e 


-47 


Protein name 








Locus Name 




Acc# . 


ORF396 protein 




gp: PSDMGC 




Z73914 


Description 














Pseudomonas stutzeri orf175 gene.




NT 

Length 


AA - 
— , Score 
Length 


Probability 


197137_c2_46 1637 | 3557 


7,10 




2133 965 


|2.5e 


-156 


Protein name 








Locus Name 




?Acc# 










sp:DXS_HAEIN




P45205


Description 








! - ... 






1-DEOXYXYLULOSE-5-PHOSPHATE SYNTHASE (DXP SYNTHASE)






ORF Name NTID AAID

,(.-=- , • ... 


NT 
Length 


AA 

_ — , Score \. 
Length 


Probability 


22697263_cl_37 . |I638 3558 


104 . 




315 |8 7 | « 


|0.0021> 


Protein name 








Locus Name 




Acc# 


probable enoyl-CoA hydratase






pir:E70868




E70868


Description ", 














. ORF Name ' ' • NTID . AAID . 


NT 
Length 


AA 

■ — , : Score 
Length 


Probability * 


24323500_tl_5 1639 3559 


171 1 




516 435 


|7.0e 


-41 


Protein name 








Locus Name 




'.V Acc# 










sp:HEM6_ECOLI




P36553 • 


Description' , 














(COPROPORPHYRINOGENASE) (COPROGEN OXIDASE)


ORF Name / , . NTID AAID 


NT 
Length 


AA 

, — . • Score 
Length 


Probability 


30120325_ci_32 |1640 3560 


103 




312 






Protein name. ' 








Locus Name/ 




' Acc# 



Description 



NO-HIT






ORF Name 



NTID 



AAID 



33449042 cJ b4 



NT 
n 
T7Z 



AA 

t — ^ Score 
Length Length : — 



T7W 



Probability 
|6.2e-26 — r— 



Protein name 



Locus Name 



gp:AF010139



Acc# 
AF010139



Description 



Azotobacter vinelandii iron-sulfur cluster assembly gene cluster, suhB,
cysE2, iscS, iscU, iscA, hscB, hscA and fdx genes, complete cds; ndk gene,
partial cds.



ORF Name 



NTID 



AAID 



33986312 12 16 



NT 
n 



AA 

: — , Score 
Length Length — — 



F3TT" 



522 



Probability 
|4.3e-50 



Protein name 



Description 



• Locus Name 



sp:GCH2_HAEIN



Acc# 
P44571 



GTP CYCLOHYDROLASE II


: ORF. Name ' NTID 


AAID . 


NT ■ 
Length ■ 


AA 

. — Score 
, Length 


Probability 


35441086_c2^43 : 1643 


■ 3563 


149 




450 |?6 | 


10.011 


Protein name 




[ r 




Locus Name , 




Acc# 


cell wall-binding protein homolog yvcE




pir:F70031




F70031


Description ■ 














ORF Name . NTID 


AAID 


'• NT, 
Length 


AA 

' . - ( . : Score . 
Length 


Probability 


B859V0.3_cl_33 | 1644: 


.3564 " 


464 . . 




1395 ;.■ |705 | 


|1.7e 


-69 


Protein name 








. Locus Name 




Acc# 










gp:ECOFOLC




. 302808 



Description 



E.coli folC gene encoding folylpolyglutamate-dihydrofolate synthetase, and a
protein required for its expression, complete cds.



ORF' Name 



NTID 



AAID 



1046526 ci 177 



fT^T5" 



3565 



NT 
n 



AA 

• •-- , - Score 
Length Length : — —- — 



^5T" 



Probability 
3.2e-29 . — - 



Protein name 



Locus Name 



yrp protein: multiple regulator protein 



pir:S70842



ACC# ; 
S70842 



Description 






ORF Name 



NT ID 



AAID 



10588311 c3 274 



NT 
Length 
RFUT — ^ 



AA 



— . Score ■ Probability 
Length • — . — • — — — J ~ 



Protein name 



.Locus Name 



ribonucleoside-diphosphate reductase, beta
chain



Description 
ORF Name 



NTID AAID 



1050-2250 12 95 



Protein name 



aluminum tolerance protein 



Description 



|pir:C54135 



NT 
Length 
T72 — I 



AA 
Length 
1793 — 



1495 I 13 '.3e-153 



Locus Name 



pin PC4440 



• Acc# 
C64135 



Score , Probability 



2 .2e-2i 



Acc# , 

PC4440 :PC4 
514 



ORF Name 



NTID AAID 



10751005 cl 182 



Protein name 



Description 



NT . ' 
Length 
T^l - 



AA 
Length 
[452 



Score . Probability 
[TT5 - 



Locus Name 



gp:ABCARRA



5.0e-08 .. 



Acc#, . 
X70360 



A. brasilense carR gene.






i, - ■ , - "T " -- 




' ORF Name '*• 


' NTID 


AAID 


• NT 
Length 


AA ■ 

— , r Score 
Length 


Probability / 


11912951_tl^20 • 


1649 


355.9. ; 


107 . | 


324 




Protein name 








Locus Name. • 


acc#,; . 


Description 












NO-HIT


ORF Name 


■ NTID 


AAID 


NT 

Length 


AA ' 
— , Score 
Length , 


Probability 


1297215_c3_289 


1550 


3570 


185 ■■■■ 


558 | 125 


. 5 . Oe-0.8 - 


Protein name 








Locus Name 


'■ :ACC# 


colicin V production protein homolog 




■; pir:H7.01dS 


E70195 



Description 






ORF Name . 


NTID 


AAID 


: ; NT 

Length , 


.. AA 

I Score . 
Length 


Probability 


14275330_t2_68 


1651 


3571 ■ 


489 . 


1470 |377 | 


|9:9e 


-35 


Protein name 










Locus Name 




Acc# 












sp:Y4WB_RHISN 


P55680 


Description .. 
















HYPOTHETICAL ZINC PROTEASE-LIKE PROTEIN Y4WB












ORF Name 


NTID 


AAID 


NT 

.. Length 


AA 

T — ; , Score : 
Length 


Probability 


I4508500_c2_247.- 


1652 . 


|3572 




. 1542 11546 1 


|1.3e 


-158 


Protein name 










i 

Locus Name . 




' Acc# 


amidophosphoribosyltransferase




pir:XQEC




Description 














F65003 :A92 
366 :A92367 
:S0i389:i5 


ORF Name ■ 


NTID 


AAID 


NT , 
Length 


" AA 
— - Score 
Length 


Probability 


14900187_13_134 


1653 


3573 


220- 


GG1 361 • 


4. 9e 


-33. 


Protein name 










Locus Name 




. Acc# 


probable 2-hydroxyhepta-2,4-diene-1,7-dioate




pir : A64864 




A64864 


isomerase b1180.






























Description 
















ORF Name '■ 


■NTID 


AAID 


NT.. 
' Length 1 ■ 


AA- ■' , o 

t -^»-h Score 
Lengtn 


Probability ' 


15908263^ciJL;43 




3574. 


, • 108 


327 |307 | 


|2.6e 


-27 • • 


Protein name • 










Locus Name ■ ■ -' 




Acc# ' 


RpsA




gp:AF035937 


AF035937



Description 



Pseudomonas aeruginosa strain IATS 06 RpsA (rpsA) gene, partial cds;
Ihf-Beta, Wzz (wzz), and Wzx (wzx) genes, complete cds; and wbp gene cluster
for O-antigen biosynthesis, complete sequence.



ORF Name 



NTID : .AAID 



16194442 13 135 



~3 



TF55" 



T575" 



NT AA. 
Length Length 
¥F5 n 11377 



Score . Probability 
11285 



6.0e-131 



. Protein, name 



Locus Name 



sp:PUR2_SALTY



Acc# ' 
P26977 



Description ; , 

RIBONUCLEOTIDE SYNTHETASE) (PHOSPHORIBOSYLGLYCINAMIDE SYNTHETASE)






ORF Name 



116828790 ET 43 



Protein name



NTID 



AAID 



NT ■ , AA , ,; 
Length Length 
1292 



Score 



Probability 
b.9e-37 



Locus Name 



sp:YJAD_HAEIN



Description , 
HYPOTHETICAL PROTEIN HI0432 



; ACC# 

P44710 r 



ORF Name. 



NTID AAID 



19698381 el 189 



NT ' , AA 

Length Length . 
£7T 1 



Score Probability 



EH~] 



Protein name 
Description . 
NO-HIT



Locus Name 



Acc# 



ORF Name 



NTID' AAID 



1972931 t2 63 



NT , ' AA 
Length ••■ Length 





Score 



p7 



Probability 
10,023. — 



Protein name 



Locus Name 



unknown • 



tap.: API 97126 



... ; Acc# 
AF197.12 8 



Description 



Rattus norvegicus unknown mRNA.




■-■ ,■ • 1 




ORF Name


. i NTID AAID 


v NT 
Length 


.AA . ^ 
^ i Score 
Length 


Probability ■ 


|20601558_cl_163 


16b9 3b79 


1 ^ • 


822 , . |737 ] 


|7:0e-73 


Protein name 






Locus Name 


Acc# 



Description 



sp:YGHU_ECOLI



Q46845 



HYPOTHETICAL 34.2 KD PROTEIN IN GSP-HYBG INTERGENIC REGION




ORF Name 


NT AA 
NTID AAID , — • — , Score 
Length ■ Length 


Probability 


pill556_cl_164 


| 1660 |3b80 j 1383 | |4152 161 • 


1.3e-29 



Protein name 

Description 
EXODEOXYRIBONUCLEASE V GAMMA CHAIN



Locus Name 



sp:EX5C_HAEIN



Acc# 
P44945 



ORF Name 


NTID

i — 


AAID 


NT ' 
Length 


AA 

, Score. 
Length 


Probability : . 


Z1642556 c3 272 


| 1661 


3581 


87 . 




264 




Protein name 










Locus . Name 


Acc# "• 


Description 






■ -i 








NO-HIT


ORF Name


NTID


AAID


NT 
. Length 


AA 

•— . , . Score 
Length 


Probability 


2275138V_c3_27i 


| 1662 


. 3582 


150 




453 | , 




Protein name 










, Locus Name 


Acc# . 


Description 












s ■ 


NO-HIT 














■ ORF Name ' :■ 


NT ID 


AAID 


NT' . 
'.Length 


AA 

' — . , Score 
, Length - - 


■ Probability 


^6lib27_c3_275- 


| 1665 


3583 


. 114 




345 . • 140' 


1.3e-09 


Protein name 










Locus Name 


Acc# 



spiYPAD-HAEIN 



P45154 



Description 
HYPOTHETICAL PROTEIN HI1309 



ORF Name 


NT ID ' AAID: 


NT . 
. .Length 


AA 

, — , , Score 
Length 


Probability ' . 


2!3 6 76 0 3 5_C3_2 62 


| 1664 3584 


410 


1233 |177. | 


|1.3e^i0 


Protein name 






Locus Name 


' Acc# 


YthP






gp:AF008220 


AF008220


Description 


> 








Bacillus subtilis rrnB-dnaB genomic region.








ORF Name • 


NT ID ' AAID' 


NT . 
Length' 


AA 

, — . , Score 
Length 


Probability 


|2372538.7_c2_244 


| 1665 |.|3585 


318, 


|957 | |1046| 


|1.3e-105 . | 



Protein name . ' 

Description 
CELL DIVISION PROTEIN FTSY 



Locus Name 



Acc# 
P44870 






- ORF Name NT ID AAID 


NT 
Length 


AA 

■ v — .., .-■ Score 
Length . 


Probability 


2372846^>_cl_161 1666 5586 


925 


2778 |2856' | 


|2.0e-297 


Protein name 






Locus Name 


Acc# 


pyruvate dehydrogenase (lipoamide)


gp:AZPDHE


Y15124 


Description , : - 1 


* i ■ 


Azotobacter vinelandii pdhE gene.


ORF Name NTID AAID


NT 
Length 


. ...... 

AA 

■ — '- . Score 
Length 


Probability 


23?89512_c3_268- 1667 3587 


393 


1182 |1066 | 


|9.6e-108 



Protein name 
Description 
ORF Name 



Locus Name 



sp:PHEA_PSEST



NTID AAID 



24303127 ci 171 



NT ■ AA 
■■ — Score 
Length . Length —■ — — 

[7ST- 



1224 



Probability 
|3\le-79 ~~ 



Protein name 



Locus Name 



carboxynorspermidine decarboxylase



gp:VIBCANSDC



Acc# 
D31783 



Description



Vibrio alginolyticus nspC gene for carboxynorspermidine decarboxylase
(CANSDC), complete cds.



Acc# 
P27603 .-. 



ORF Name 



NTID AAID 



24407750 c3 253 



1669 



T5W 



NT . ' AA . 
Length Length 
1248 - 



7TT 



Score Probability 



l.le-58 



Protein name 



■Description 



Locus Name 



sp:DCOP_HAEIN 



Acc# 
P43812



DECARBOXYLASE) 


ORF Name 


NTID 


AAID 


, NT 
Length 


AA 

_ — . , Score 
Length 


Probability 


24642711_c2_216. 


. |1670 


, [3590 


773 | 


2322 |960 | 


|1.6e-96 


Protein name 








Locus Name 


Acc# 



sp:AROA_BACSU



P20691 



Description ' _ 

(5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE SYNTHASE) (EPSP SYNTHASE)






ORF Name 



• NTID AAID 



.NT AA 
~ — — " Score 
Length Length — 



25025987 c2 232 



T5^r 



Probability 
|4.be-46 



Protein name 



Locus Name 



sp:YRBH_ECOLI



Acc# 
P45395 



Description ■ ' 

HYPOTHETICAL 35.2 KD PROTEIN IN MURA-RPON INTERGENIC REGION (O328)



ORF Name 



25431625 c3 2bl 



NTID AAID 
11672. .. 



T^T2~ 



NT ". .\ AA 
Length Length 
1106 I 1521 • 



Score 



TIT 



Probability 
|I.6e-ltt 



Protein name 



Description 



Locus Name 



sp:IHFB_ERWCH



Acc# 
P37983 



INTEGRATION HOST FACTOR BETA-SUBUNIT (IHF-BETA)




ORF Name 


NTID AAID 


NT 

Length,, 


AA 

„ — Score 
Length . 


Probability 


25445452_cljL44 


1673 3593 . 


215 


648 |313 | 


|6.0e-28 ; 


Protein name 






Locus Name 


Acc# 


conserved hypothetical protein 




pir:F75285


■ F75285 


Description 










ORF Name 


NTID AAID 


"' NT • 
Length . 


AA 

, — -. , Score 
Length 


Probability 


2bb64402_c3_28b ^ 


1674 3594 


739 - 


2220 82 


|9.2e-06 


Protein name 






Locus Name 


. ' Acc# 


hypothetical protein SCI7.24c




pir:T36920


T36920 


Description 










ORF Name 


NTID AAID ' 


• NT- - 
Length 


AA , 

— ■ Score 
Length . 


Probability 


26359451_c2__249 - 


167b 3595.. 


713 


2142 |2272 | 


. |l.be-2Jb . 


Protein name 






Locus Name 


' Acc# 



Description 
EXCINUCLEASE ABC SUBUNIT B



sp:UVRB_PSEAE



P72174 :P72 
14 7 






ORF Name 



NT ID 



AAID 



2675009,0_C.2 245 



NT 
n 
T5T 



AA 

T ' — ^ Score' 
Length Length 



TOT 



Probability 
pO | |4.9e-a8 



Protein name . 



Description 



Locus Name 



sp:PYRD_SALTY



Acc# 
P25468 



(DHODEHASE) - * 


ORF Name


NT ID 


AAID' 


NT 
Length 


AA 

, — -. , Score 
Length 


Probability 


2923562_c2_233 


1677 


3597 


1 177 1 


|534 | 352 


l4.4e.-32 


Protein name : 








Locus Name 


Acc# 



Description 



sp:YRBI_ECOLI



P45396:P45398



HYPOTHETICAL 20.0 KD PROTEIN IN MURA-RPON INTERGENIC REGION






ORF Name 


NTID AAID 


NT 
Length 


AA - 
— - , Score 
Length 


Probability 


29336052_rl_41 - 


1578 |3598 


474 | 


1425 |440 | 


p.le 


-46 . 


Protein name 








Locus Name 




Acc# ' 


ABC1 protein homolog T15B16.14




pir:T02007


T02007


Description 














ORF Name 


NTID AAID


■i NT 
Length 


AA 

— ■ Score, 
Length 


. Probability . 


30173201_12_94 


; ' 1679 3599 


66 


2 


01 . 






Protein name 








Locus Name 




Acc# .. 



Description 



NO-HIT


ORF Name 


NTID . 


AAID 


NT , 
Length 


AA 

-j- — . , . Score 
Length 


•Probability 


30469092_ci_151 


1680 


|3600 


250 | 


. |753 | 152 


|6.4e-09 


Protein name 








. Locus Name 


Acc# 



gp:MLCL622



Z95398 



Description ! 
Mycobacterium leprae cosmid L622. 






ORF Name NTID AAID NT AA Score Probability
Length Length



306004b3_fc3_i39 | |16S1 


3601 


697 


2094 |7 7 7 | 


|2.0e-86 ■ - 


Protein name , 








Locus Name 


Acc# 


hypothetical protein b2324


pir:B65005


B65005 


Description 












ORF Name . ■ NTID 


AAID 


NT. 
Length' 


AA 

„ -■ — . Score 
Length- 


Probability 


307>0027_t3_141 |1S82 


3602 


154 ■ 


I 4 " 1 ^ 


2.9e-35 


Protein name 








Locus Name 


,. Acc# ■ 


hypothetical protein 


gprPPPALl . f 


X74218


Description ; 






Pseudomonas putida ruvB, tolQ, tolR, tolA, tolB and oprL genes.


ORF Name NTID 


AAID 


-NT 
Length 


AA 

— ■ Score' 
Length 


-.Probability 


31800280_cl_158 |1683 


| 3603 


305 


918 . 649 


1.5e-63 


Protein name 








Locus Name 


. Acc# 


hypothetical protein 


gp:PFFC2


Y11998 ' 


Description' 












P.fluorescens FC2.1, FC2.2, FC2.3c, FC2.4 and FC2.5c open reading frames.


ORF Name NTID 


AAID 


NT , 
Length . 


AA - : 1 ' 
, , Score 
Length 


Probability 


318282il_t2_69 1584 


■ ; , 3604 


1208/ 


3627 2934 , 


|6.0 


Protein name 








Locus Name 


ACC#. . 


proline dehydrogenase 








gp:ATU39263


U39263 


Description . 












Agrobacterium tumefaciens plasmid pAtR10 proline dehydrogenase (putA) and Prp
(prp) genes, complete cds.




ORF Name '■ NTID 


AAID .. , 


NT 
Length 


•;' aa 

— L1 ■ Score ... 
Length 


Probability 


33229667_c3_270 , , 1585 


|3605 


72 . 


P 19 1 ■•- 




Protein name. ' 








Locus Name 


. Acc#' 



Description 
NO-HIT



ORF Name 



NTID AAID 



■■ NT 
Length 
288 



AA 
Length 

[867 



Score Probability 



Protein name 
Description . 
NO-HIT



Locus Name 



Acc# 



' ORF Name ' NTID AAID 


NT •' 
Length 


AA 

— - Score 
Length 


Probability 


34062503_ci_178 1587 | 3607 


181 


|546 | 224 


1.6e 


1 


Protein name






Locus Name 




Acc# 








sp:YHBN_HAEIN




P45074 


Description 












HYPOTHETICAL PROTEIN HI1149 PRECURSOR


ORF Name . NTID AAID . ' 


NT 
Length 


AA " 
. — ■ , , Score 
Length 


Probability' , 


34i72883__ci_i76 1688 |3608 


.166 


501 296 


3 . 8e, 


-26 ; | 


Protein name 






Locus Name 




Acc# 








sp:YJEE_HAEIN




P44492 


Description 












HYPOTHETICAL PROTEIN HI0065 PRECURSOR


ORF Name .. . NTID AAID 


NT . : ' AA . , 
- — — — Score 
Length Length 


Probability 


|34409658 12 84 1689- | |3609 


36 0 


1083 . 757 | 


|5.3e 


7b " 1 


Protein name 






Locus Name 




■ Acc# 


carboxyl esterase




pir:S57530 " 




S57530 


Description 










V . 


ORF Name . . NTID AAID 


NT 
' Length. 


AA 

. - — . , Score 
Length 


.Probability 


55i57165_cl_19i | 1690 p&io 


204 


615 ■ ' 303 < 


2,7e 


-29 | 


Protein name 






Locus Name 




Acc# 


methylated-DNA--protein-cysteine
S-methyltransferase




pir:D64604 




D64604 • 











Description 




■.ORF Name 



NT ID 



AAID 



NT " AA 
T > t Score 
Length , Length : 



Probability 



36072135_c3_2yb 


1591 


|3611 




• 20 r 5I 564 


i.Se-54 | 


Protein name 










Locus Name. 


Acc# . 












sp:EX5A_ECOLI 


Description 












378': 


ALPHA CHAIN)


ORF Name 


, NTID 


AAID 


NT 
Length 


AA 

. • — . '/ Score 
. Length 


Probability 


36ii2900xt2_m 


1532 


3512 


103 


312 253 


1.4e-21, 



Protein name 



Description 



Locus Name 



gp:ECU24202



Acc# 
U24202



Escherichia coli ECOR bu (yciD) gene, partial cds, and (yciC), (yciB),
(yciA), membrane protein (tonB), (yciI), putative potassium channel (kch), and
cardiolipin synthase (cls) genes, complete cds.



. ORF Name ' NTID 


AAID 


NT 

Length 


AA 

■ — « . . , Score 
Length . 


Probability, 


35129576_c3_252 159 J . 


3613 


139,.. . 


J420 | |85 | 


|0 .00086 


Protein name \ 






Locus Name 


.. Acc# 


hypothetical protein yrvD 






pir:G69980 


G69980 


Description 










■. ORF Name NTID 


AAID ■■■■ 


NT 
Length. v 


AA 

• — , . Score 
Length 


Probability '■' 


3915943_c2_22& 1594 . 




414 


|124b | 1153 




Protein name 






Locus Name 


'•: Acc# 








sp:METZ_PSEAE


P55218


Description 










O-SUCCINYLHOMOSERINE SULFHYDRYLASE (OSH SULFHYDRYLASE)




ORF Name NTID 


AAID ■. 


Length 


AA n • 
v — . Score 
Length 


Probability 


3922193_:fc2_50 , 1695 : 


3515 


381 


1145 [708 | 




■i ■. j 

Protein name 






Locus -Name 


; ACC# 


probable pvas protein 


pir:B70591


B70591 


Description 



ORF Name 



NT ID 



AAID. 



5333457 c2 224 



NT AA 
Length Length 
202 



Score Probability 
, |387 | |8.5e-36 



Protein name 



Locus Name 



hypothetical protein jhp0867



pir:B71875 



Acc# 
B71879



Description 
ORF Name 



NTID . AAID 



14016943- 12 67 



NT AA. 
Length Length 
1473 • I 11422 



Score 



Probability 
3.6e-64 — " 



Protein name . 



Description 



Locus Name 



sp:Y4WA_RHISN



Acc# 
P55679 



HYPOTHETICAL ZINC PROTEASE Y4WA












. ORF Name 


NTID AAID 


NT 
: Length' 


AA 
Length 


Score 


Probability 


4103293_cl_179 


1698' 3618 , 


245 


738 




822 


6.9e-82 • . 



Protein name 



Locus Name 



putative ABC transporter ATP-binding protein



gp:AF013987



Acc# 
AF013987 



Description 



Vibrio cholerae strain O395 putative ABC transporter ATP-binding protein,
sigma54 (rpoN), putative sigma54 modulation protein and nitrogen regulatory
IIA protein (ptsN) genes, complete cds.



ORF Name 



NTID AAID 



14114702 cl lb 9 



1W 



3619- 



NT AA : 

Length Length 
TT9 ~] \JW 



Score 



Probability 
2.4e-lB 



Protein name 



Locus Name 



probable dihydroneopterin aldolase, 



pir:H65093



• Acc# - 
H65093



Description 
ORF Name 



NTID AAID ■ 



4489463 12 90 



1700 



NT 
n 
WIS 



AA 

^ T ^ SCQre 

Length Length — r: 



Probability 



TTTW 



Protein name 
Description 
NO-HIT



Locus Name 



; ACC# 






ORF Name 


NTID 


AAID 


NT:. 

Length 


AA ' 

T — , , , Score 
Length ■ 


Probability 


4689693_c3_278 


|1701 


3521 


3.72. 


.. 1119 1 1509 1 


|8;2e-55 


Protein name 










Locus Name 


Acc# 












sp:MIAA_HAEIN


P44495


Description. 














(IPP TRANSFERASE)


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
, Length 


Probability 


47720b0_t2_86 


1702 . 


1522 


441 | 


|1325 |487 | 


|2.2e-46 



Protein name 

Description 
DNA POLYMERASE III, EPSILON CHAIN



Locus Name 



sp:DP3E_HAEIN



Acc# , 
P43745 



ORF Name 



NTID AAID 



4816513 c3 294 



T7UT 



NT . AA , 
Length Length 
1318 . I 13957 : 



■ Score Probability 



|230 | |3 .le-4I 



Protein name 

Description 
BETA CHAIN)



Locus Name 



sp:EX5B_ECOLI



Acc# 
P08394 



ORF ; Name . 
4863458 c2 ; 234 



NTID AAID. 



NT 



AA" 



Length " Length 



Score ■ Probability 



T72 



Protein name, 
Description 
NO-HIT



Locus . Name ' 



Acc# . 



ORF Name 



NTID 



AAID 



NT AA . 
— " — Score 
Length Length 



4876525 c3 293 



T7u3" 



Protein name 

Description 

NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



i NTID 



1487'aiJb c'A 2b0 



AAID 



NT AA 
— , . — Score 
Length Length •- ■ ; — 



rorr 



Probability 
|1.3e<-8 0 — 



Protein name
yhdG homolog



Locus Name " 



gp:AF040378



Acc# 
AF040378 



Description 



Serratia marcescens ribosomal protein L11 methyltransferase (prmA) gene,
partial cds; and yhdG homolog and small DNA binding protein Fis (fis) genes,
complete cds.



ORF Name 



NTID AAID 



4881700 c3 290 



T7TTT" 



TZTT 



Protein name 



hypothetical protein b 



Description 



NT 
Length 
479 V 



AA 



— ? Score Probability 
Length ' • ■ — ~ - ■ 



TT4TT 



Locus Name 



pir:T00101



3.7e-35 



Acc# ' 
T00101 



ORF Name 



NTID AAID 



NT 

Length 



AA 



— , Score Probability 
Length — ~ — • — - 



4884575_c3_283 


1708 . | 3528 


253 1 


752 13 8 


1.5e-07 


Protein name 








Locus' Name 


Acc# ' . 


hypothetical protein 




gp:AF031940 


AF031940 


Description ■ 












Sinorhizobium meliloti alcohol dehydrogenase (adhA) gene, complete cds.


ORF Name - , , 


. NTID AAID • 


NT 
Length 


AA 

— Score 
Length ■ 


Probability 


50?5593__c3_277 


1709 Jb^y 


419 


1250 . 7-93 


8.2e-79 


Protein name. 








Locus Name 


Acc#. . . 


hypothetical protein slr0049






pir:S74347


S74347 


Description • 












ORF Name" 


NTID AAID 


NT 
Length 


AA 

— Score 
Length 


Probability 


5098937_!2__bl 


1710 | 3530 


.543 1 


1532 Ib20 


| |3.1e"-l>3 


Protein name 








Locus Name 


Acc#. 


probable exodeoxyribonuclease VII large 


pir:C75549


C75549 


subunit 























Description 






ORF Name 



NTID AAID 



5110963 cl 162 



mr 



- NT 
Length 
b.59 I 



AA . 
Length 
11580 



Score 



11055 



Protein name 

Description 
COMPLEX, (E2) 



Locus Name 



sp:ODP2_PSEAE



Probability 
ll.le-106 . 

- .. , Acc# 
. Q59638 , 



ORF Name 


NTID 


AAID 


NT 

Length- . 


AA 

T — L1 Score 
Length 


Probability 


|5112763_t2_89 • 


11712 


13632 


|pvv | 


|834 | 364 | . 


|2.4e-33 | 


Protein name 








Locus Name 


Acc# 










sp:YDGM_HAEIN 


. P71396 


Description 












PUTATIVE FERREDOXIN-LIKE PROTEIN HI1684


ORF Name. 


NTID 


-AAID 


NT 

Length 


■ AA ' " 
: — ■. • Score 
Length 


Probability. 


|5323750_t3 108. 


1713 


3633 


104. 


315 




Protein' name 








Locus Name 


Acc# 


Description 












NO-HIT


ORF Name' 


NTID : 


AAID' ' 


.-. NT- 
Length 


AA 

j — \ , Score 
Length 


Probability 


|6484691_tl_26 


[I'/14. 


3634 


|377 


1134 ' |738 |; ; 


|b:be-73 | 


Protein name 








Locus Name 


Acc# .. 










sp:CYSP_ECOLI


P16700 


Description 












THIOSULFATE-BINDING PROTEIN PRECURSOR


ORF Name 


NTID - 


AAID 


\: NT 
Length 


AA 

— , Score 
Length 


'', Probability 


|806512;_c3j279' 


J 1715 3635 


|137 . 


414 |171 


i-.8e.-12 


Protein name 








Locus Name 


Acc# 


polysialic acid capsule expression 


protein 


pir :B70434 


B70434 



Description ■ 






ORF Name. 


NTID 


AAID . 


NT 
Length 1 


AA 

— . , Score 
Length 


Probability 


Tl 552 c3 7 ■ 


: J1715 - 


3636 


1 78 
1 


237 1 
1 




Protein name 








Locus Name 


Acc# 


Description 


















... :■ 








ORF Name 


NTID 


AAID* ' 


NT 
Length 


AA 

, — .., Score 
Length 


Probability 




1717 


3537 


" 70 1 
1 


1213 1 

r 1 




Protein name k 








Locus Name. 


Acc# 


Description 












NO-HIT




ORF Name 


. NTID 


AAID 


: . NT 
Length 


AA 

' : — i Score 
Length 


Probability 


35117i35_tL_l 


.1718 


3538 


33b | 


|1008 | |1334 | 


,|3.8e-136 



Protein name 



Locus Name 



malate dehydrogenase ; 



gp:AF109682



Acc# . ' 
AF109682 



Description ' - 

Aquaspirillum arcticum malate dehydrogenase (MDH) gene, complete cds.



ORF Name. 



NTID AAID 



11719 



3539. 



NT AA 
' Length ; . Length 
1 



— : ' Score Probability 



Protein name 



Description 



Locus Name 



Acc# 



NO-HIT


ORF Name 


■NTID 


AAID 


NT 

Length ■ 


AA • 

■ . — . , Score 
Length 


Probability 


13i558403_il_l 


1720 


3640, 


399 


|1200 | l2bl 


p.4e-127 



Protein name 



Locus Name



sp:YLIG_ECOLI 



Acc# 
P75802 



Description 

HYPOTHETICAL 49.5 KD PROTEIN IN MOEA-DACC INTERGENIC REGION






ORF Name 


NTID 


., AAID 


NT 
Length 


AA 

T — Score 
Length 


Probability 


|16&062bOJ:2_4 


[1721 


3641 


140 


423 236 | 


4 v .8e-19 


Protein name, 








Locus Name 




Acc# 


unknown








gp:AF026544




AF026544 


Description 






• 








Ralstonia eutropha phbF and beta-ketothiolase (bktB) genes, complete cds; and
unknown genes.


ORF Name 


NTID 


AAID 


NT 
Length 


' AA - '■ 
Length 


Probability 


20782550_t3_i6 


1722;, 


3642 . .. 


242 


729 933 1 
1 


1.2e 


-93 ; • 


Protein name 








Locus" Name , 




- Acc# 










sp:MTNG_NEIGO




P08455 , 


Description 














METHYLTRANSFERASE NGOPII) (M.NGOPII)










ORF Name 


NTID 


AAID 


NT 

Length 


AA 

• — , ■' Score 
Length. , 


Probability 


24328950_r2_8 


| ..[1723 


|3643 ' | 


153 | 


|462 | . 335 


|2 v 2e 


-30' 


Protein name 






'i 


, Locus ' Name 




' : Acc# 










sp:YRFH_ECOLI




■ P45802 


Description 


i 












HYPOTHETICAL 15.5 KD PROTEIN IN MRCA-PGKA INTERGENIC REGION (O133)




ORF Name 


\ NTID 


/ AAID 


NT
Length 


AA ■ ■ 
„ ~ ~ , , Score 
Length 


Probability - 


2983979u_c2_32. . 


| |1724 


| 3544 


74 :, 


|225 






Protein name 








. Locus Name 




Acc#" 


Description 














NO-HIT


ORF Name • 


. NTID 


AAID ' 


NT 
Length 


AA 

— - Score 
Length 


Probability 


3942592 J:2-_6, 


|1725 




1252. 


|759 . | |741 | 


2.6e 


-73 


Protein name 








;Locus Name 




■ Acc# 


hypothetical protein, 26K 






pir:JC5479




JC5479 . 



Description 



ORF Name ■ NT ID 


AAID 


NT 
Length 


AA. ■ 
, . , Score 
Length , 


Probability 


4103390_t3_15 | 1725 


. 3546 


F 


258 . 323 


5.2e-29 


Protein name 








Locus Name 


Acc# 










sp:MTNG_NEIGO


P08455 


Description 










_ __ 














ORF Name NT ID 


AAID 


NT 
Length 


AA 

: — - , Score 
Length 


Probability 


42837_ri_2 • 1727 






216 [170 | 


|2.be-12 | 


Protein name 








Locus , Name 


Acc# 










sp:MTNG_NEIGO


P08455 


Description • 












METHYLTRANSFERASE NGOPII) (M.NGOPII)










ORF Name NT ID 


AAID 


NT 
Length 


AA

t : — , -l Score 
Length. 


Probability 


4976512_t2_7 • 1728 


. 3648 


518 


1557 [1255] 


|7.8e-129 


Protein name 








Locus Name 


\ - Acc# . 


threonine dehydratase, biosynthetic 




pir:E75502


E75502 


Description i" 












. ORF Name ' NT ID 


AAID 


NT 
Length 


. AA ' '■. 
-■ — ; , Score 


. • ' ! # • ; 

Probability. : 






Length ■ 




7638307_tl_3 " 1729 ■ 


3649 


12 7 • 


384 228 

■(>'- 




Protein name , 








Locus Name 


Acc# 










sp:PAlP_HUMAN 


— 1 P24666:Q16" 


Description 










03,5:Q16725 


(EC 3.1.3.48) (Adipocyte acid phosphatase, ISOZYME ALPHA)




ORF Name NTID


• AAID-' 


NT 
Length 


AA 

— . , Score 
Length 


Probability 


301759b0_ll_l 1730 


| 3650 


77 ... 


234 292 I 


5.8e-25 


Protein name , 








Locus Name 


Acc# ' 



sp:THIC_BACSU 



Description 
THIAMINE BIOSYNTHESIS PROTEIN THIC 



P45740:P71090






ORF Name 



NTID 



AAID 



i44 V 0im 13 b 



T7TT" 



NT AA
Length Length



Score 



T7T" 



Probability 
4.5e-62



Protein name

Description 
THIAMINE BIOSYNTHESIS PROTEIN THIC



Locus Name 



sp:THIC_ECOLI 



Acc#
P30136 






ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


|711900i_t2_4 


1732 


3652


73


222 | |8b . | 


0.013


Protein name 








Locus Name 


Acc#



sp:YA51_HAEIN 



Description

HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN HI1051



Q57180:O05 
043 



ORF Name 


NTID


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


242^4702 13_3 


1733


3653




|7b3 | 2S3 


1.4e-21


Protein name










Locus Name 


Acc#












sp:YIAT_ECOLI


P37681


Description 














HYPOTHETICAL 27.4 KD PROTEIN IN AVTA-SELB INTERGENIC REGION PRECURSOR


ORF Name 


NTID


AAID


" . NT 
Length 


AA
Score
Length 


Probability 


2b7B7bOO^c3^6 : , 


1734 


3654


278


837 503


4.4e-48


Protein name 










Locus Name 


Acc#












sp:BFRA_NEIGO


P72080


Description 














BACTERIOFERRITIN A


ORF Name 


NTID


AAID 


NT
Length 


AA 

Score
Length 


Probability 


5133575_c3_7 


1735 




152 


4S5 | |485 | 


1.3e-46


Protein name 










Locus Name 


Acc#












sp:BFRB_NEIGO


P77914


Description










i. 




BACTERIOFERRITIN B 


(BFR A)


(BFR B)






.i ■ 








ORF Name 



NTID AAID



12198437b cl 10 



TTJT 



NT 
n 
T7T 



AA 

Score
Length Length



706



Probability 
8.3e-70



Protein name

Description
ACTIN INTERACTING PROTEIN 2



Locus Name 



sp:AIP2_YEAST 



Acc#
P46681 



ORF Name 
|2353!>Mu_tlJ> 



NT ID 
T7TT 



AAID 



TOST 



NT AA 
Length Length 
F^T" 1 11479 " 



Score 



Probability 
1.4e-152



Protein name

Description 
PUTATIVE PROTEASE YEGQ



Locus Name 



sp:YEGQ_ECOLI



Acc# 

P76403:O08
007:O08010



ORF Name
|25677176_t^6 



NTID



AAID 



3558 



'. NT AA 
Length . Length 
1205 I 1515 



Score 



"5HT 



Probability , 
1.8e-49



Protein name



Locus Name 



site-specific DNA-methyltransferase
(cytosine-specific) HP1121



pir:A64660



Acc#
A64660 



Description 



ORF Name 



NTID 



AAID



3925252_c2_13



1739



NT 

in 



AA 

— - Score 
Length Length , - — 



] [ 



TIT" 



Probability 
8.5e-08



Protein name 



Locus Name 



TerZ 



gp:AF158355 



Acc# 
AF168355 



Description 



Proteus mirabilis tellurite resistance locus, complete sequence; and unknown
gene.



ORF Name 


NTID AAID


NT 
Length 


AA 
Length . 


Score 


Probability


3945943_11_1 


1740 3550 


P° . 1 


1533 


• BO 8 | 


2.1e-80



Protein name 



Locus Name



OprM 



gp:AB011381



Acc# 
AB011381 



Description 

Pseudomonas aeruginosa gene for OprM, complete cds.






ORF Name


NTID AAID


NT 
Length 


AA 

Score
Length 


Probability 


21106bV_c2J4. 


1741 3661


221 


663 |B70 | 


|3.be-55 


Protein name 








Locus Name 


Acc#










sp:Y926_SYNY3


P72872 


Description 












XI 1 ±T \J ± 11 Hi J. -L ^_^-i.J_l J / 


_7 1\ 1 J IT 1\. W _L -C-j _L 1M O 1 II lU 










ORF Name


NTID AAID 


NT 

Length 


AA 

- — , Score 
Length 


Probability 


|16040aa7_tJ_il 


1742 3662


504 


|1512 , p557 


p.6e-266 


Protein name 








Locus Name 


Acc# 


unknown






gp:AF039312


AF039312 



Description 



Moraxella catarrhalis strain 4223 transferrin binding protein A (tbpA) and
transferrin binding protein B (tbpB) genes, complete cds; and unknown gene.


ORF Name NTID AAID


NT 
Length 


AA ' 

— , .'Score 
Length 


Probability


4016b63_cl_13 1743 3663


I 1 ? 8 


P 7 1 




0.00021


Protein name




Locus Name


Acc#


conserved hypothetical protein ykoJ




pir:F69859


F69859


Description










ORF Name NTID AAID


NT 
Length 


AA; ■ 
. — : ■ Score 
Length:, 


Probability


4484b67_tl_l v j 1744 3664 ... 


899


2700 


4565 


0.0


Protein name 




Locus Name


Acc# 


transferrin binding protein A




gp:AF039312 


AF039312



Description 



Moraxella catarrhalis strain 4223 transferrin binding protein A (tbpA) and
transferrin binding protein B (tbpB) genes, complete cds; and unknown gene.


ORF Name 


NTID AAID NT Length AA Length Score Probability


|4775207J:2_9. 


1745 3665 173 522 728 1.6e-71



Protein name 



Locus Name 



transferrin binding protein A



gp:AF039315



Acc#

AF039315 



Description 



Moraxella catarrhalis strain Q8 transferrin binding protein A (tbpA) and
transferrin binding protein B (tbpB) genes, complete cds; and unknown gene.






ORF Name 


NTID


AAID 


NT
Length 


AA 

Score
Length


Probability 

■ -U - ■ 




1746


" 3666 | 


65 


198




Protein name 








Locus Name 


Acc# 


Description












NO-HIT


ORF Name 


NTID


AAID 


NT 
Length 


AA 

Score
Length 


Probability 


35361043_ciJ7 


1747 


3667


62 


1159 1 193 1 
III 1 


0.00048


Protein name 








Locus Name 


Acc#


phosphate-binding protein,
phosphate-repressible


pir:I64120


I64120






Description 












ORF Name 


NTID 


AAID


NT
Length 


AA 

, • ~ . , ' ' Score 
Length 


Probability 


36501b61_c3_9 


1745. 


3668


301 


903 842


5.2e-84


Protein name








Locus Name 


Acc#










sp:PSTC_HAEIN


P45191


Description 












PHOSPHATE TRANSPORT SYSTEM PERMEASE PROTEIN PSTC




ORF Name


NTID 


AAID


NT 
Length 


AA
Score
Length 


Probability 


5950433 ci 8 


1749


3669




183




Protein name








Locus Name 


Acc#


Description 












NO-HIT


ORF Name


NTID 


AAID 


NT 
Length 


. AA ■ - 

— Score - 
Length ■■ p 


Probability 


|4429510_U_1 


1750


3670 


477 


1434 1328


[1.7e-i3LJ 


Protein name 








Locus Name 


Acc# 










sp:MANB_SALMO 


Q01411


Description 












PHOSPHOMANNOMUTASE (PMM)














ORF Name 



NT ID AAID 



4459376 c2 lb 



]'E 



7"5T~ 



NT 
Length 

2 94 "v 



AA . 

T — *iu Score 
- Length - 



F75" 



Probability 
lv0e-5b 



Protein name 



Locus Name 



conserved hypothetical protein 



pir:D75311



Acc#
D75311 



Description 
ORF Name



NTID AAID



10429517 cl 34 



[T7F2~ 



NT 
Length 
ITT" " 



AA 
Length . 
1 1242 



Score 



57T" 



Probability 
1.7e-55



Protein name 



Locus Name



conserved hypothetical protein 



pir:A75525



Acc#
A75525 



Description 

ORF Name
12354525 c3 48- ; 



NTID 



AAID 



1753 



TZ7T 



NT 
Length 
352 



T -~ r \.i_ Score 
Length . — — — 



TuW 



Probability 
1.1e-35



Protein name 



Locus Name 



sp:YGBO_ECOLI



Acc# <. 
Q57261



Description

HYPOTHETICAL 39.1 KD PROTEIN IN SURE-CYSC INTERGENIC REGION



ORF Name 


NTID


AAID 


NT
Length 


AA
Score
Length 


Probability


,15915525_12. 13 


1754


| 3574 


1 I- 70 ' 


i 




Protein name 








Locus Name


Acc# 


Description








' ".■ ', ■ '■ ■ 




NO-HIT


ORF Name 


NTID


AAID


NT
Length


AA 

.— , , Score 
Length - 


Probability 


21573425_12_i2 . 


1755


3575 


447


1344 507


1.7e-48


Protein name 








Locus Name


Acc#



Description 
UBIH PROTEIN



sp:UBIH_ECOLI



P25534 






ORF Name 



NTID AAID 



2195931 ci 29 



HT71T 



NT , 
Length 
16 9 



AA 



Protein name 



conserved hypothetical protein aq_2107



Description 



Score Probability 

Length 

510 93 0.0018

Locus Name Acc# 



pir:F70480



F70480 



ORF Name 


NTID


AAID 


NT
Length 


AA 

— . , Score 
Length 


Probability 


22355001_i3_23 


1757 


3577 




222




Protein name








Locus Name 


Acc#


Description 












NO-HIT


ORF Name 


NTID 


AAID


NT 
Length 


AA 

Score
Length 


Probability 


239S4035._c3_50 : 


1758


3578


175


528 119


3.8e-05


Protein name 




i 




Locus Name 


Acc#


conserved hypothetical protein aq_2107


pir:F70480 


F70480


Description 


. i, . 








( •- 


ORF Name 


NTID 


AAID "'■ 


. ' - NT , 
, Length 


AA 

' — r ■ Score • ■ 
Length 


Probability : 


|25555885_e2_36 


1759 


3579 


241 


725 389 


5.3e-45


Protein name 








Locus Name


Acc#










sp:MIAE_SALTY


Q08015 


Description 












TRNA-(MS[2]IO[6]A)-HYDROXYLASE


ORF Name


NTID 


AAID 


NT
Length 


AA 

— . Score 
Length : — 


Probability 










|30i00880_c3_5i 




3580


P 0i J 


505. |185 | 


1.7e-14


Protein name 








Locus Name


Acc#


hypothetical protein aq_2108




pir:G70480 


G70480



Description 






ORF Name 



39940b2 FT! 



NTID 

3^ 



AAID 



NT , 
Length 
1192 



AA 

, Score Probability 
Length — — ■■- - ' 



5T3~ 



4.7e-67 



Protein name 



Locus Name 



probable dCTP deaminase



pir:B71565 



Acc# 
B71565 



Description 
ORF Name 



NTID AAID 



4109790 cl 28 



NT 
Length 
155 



AA 
Length 
1468.. 



Score Probability 
[T7^— I |2.2e-12 



Protein name 



Locus Name 



conserved hypothetical protein aq_2107 



pir:F70480



Acc# 
F70480 



Description 








ORF Name 




NTID AAID NT Length AA Length Score


Probability 


4319837_c2_37




1753 3583 493 1482 456


3.7e-44


Protein name 




Locus Name


Acc# 






sp:YJEF_ECOLI


P31806 


Description 








HYPOTHETICAL 54.7 KD PROTEIN IN PSD-AMIB INTERGENIC REGION (URF1)


ORF Name




NTID AAID NT Length AA Length Score


Probability


4345068J:3_21 


r -. i. 


1754 3584 128 387 177


|l.tie-i3 


Protein name 

i 




Locus Name 


Acc#






sp:YOHJ_ECOLI


P33372 


Description 








HYPOTHETICAL 14.5 KD PROTEIN IN PBPG-CDD INTERGENIC REGION




ORF Name 




NTID AAID NT Length AA Length Score


Probability 


4790537_t3_22




1755 3585 183 552 295


4.8e-26 



Protein name

Description
HYPOTHETICAL PROTEIN HI1298 



Locus Name 



sp:YOHK_HAEIN



Acc#
P45146






ORF Name 



NTID 



AAID 



5900203 FT r 



11755 



13685 



NT 
Length 

m^. — 



AA 
Length 
12070 



Score Probability 
[1509 | |2.8e-ife5--~ 



Protein name 



Description



Locus Name 



sp:REP_ECOLI



Acc# 
P09980 



ATP-DEPENDENT DNA HELICASE REP








ORF Name 


NTID AAID 


NT
Length 


AA 

Score


Probability 






Length




6558527_c2_39 


1757 3587


184


228 


5.6e-18 


Protein name






Locus Name 


Acc#


conserved hypothetical protein aq_2107


pir:F70480 


F70480 


Description










ORF Name 


i 

NTID AAID ; 


NT 
Length 


AA 

f — ^ Score. 
Length 


Probability 


7225518J:2_i7 


1768 3588


102 


309 112


1.2e-05 


Protein name 






Locus Name


Acc#


hypothetical protein




gp:POL010393


AJ010393 


Description


Pseudomonas oleovorans phaI and phaF genes, and ORF1, ORF2 (partial) and ORF3.










ORF Name 


NTID AAID 


NT 
Length 


AA 

T " — ^ Score , 


Probability 






Length —■ — - 




22B97332_c2_I5 : . 


1759 3589


lb3 


452 338


1.3e-30


Protein name 






Locus Name 


Acc#








sp:FMAH_BACNO


P04953 


Description 










SUBUNiT^ PTLINJ 


f ■ - ■ 








ORF Name 


NTID AAID 


NT
Length 


AA
Score
Length 


Probability 


352I087-5J:2_3 


1770 3590


883 


2552 3272




Protein name 






Locus Name 


Acc#



Description 



sp:ACO2_ECOLI



P36683:P36
648:Q59382 
:P75652 



(ACONITASE 2) 



ORF Name


NTID AAID 


NT

Length 


AA 
Length 


Score 


Probability 


14853143_cl_9 


| 1771. | 5691 


703


2112 


1460


1.7e-149


Protein name 








Locus Name




Acc# 










sp:YHGF_NEIME


Q51152 


Description 


)' *» * ' - 














HYPOTHETICAL 83.1 KD PROTEIN IN REGION E












ORF Name 


NTID AAID 


NT
Length


AA 
Length 


Score 


Probability 


[16050817 Jrl_± ■ 


1772 3592


225




\ m 1 


2.7e-16



Protein name



Locus Name 



hypothetical protein sll0788



pir:S77018



Acc#
S77018 



Description
ORF Name 



NTID AAID



10175877 .13 73 



117 73 



I3S93 



NT AA 
Length Length 
1254 



Score 



Probability 
2.2e-11



Protein name 



Locus Name 



DnrD protein 



gp:PST131715



Acc#
AJ131715 



Description
Pseudomonas stutzeri dnrD gene; and ORF194 (partial) and ORF63 (partial).



ORF Name 



NTID 



AAID 



NT 



AA



Length Length 



Score Probability



10195250 12 49 



1774




3694 




81 




2U | 



Protein name



Description



Locus Name 



Acc#



NO-HIT




ORF Name 


NTID 


AAID


NT
Length 


AA

— , Score 
Length 


Probability 




10546930_fl_18 


1775


3695


1 ™ \ 


720 576


|8.1e-5<5 


1 


Protein name 








Locus Name 


Acc# 





sp:MODB_HAEIN



P45322 



Description

MOLYBDENUM TRANSPORT SYSTEM PERMEASE PROTEIN MODB 






ORF Name 



NT ID AAID 



11113152 .ti 70 



17 76 



AA • o 
— h — , Score 

Length Length ■■ ■ - 



NT 
n 

TF2 



Probability , 
1.7e-16



Protein name 



Locus Name



hypothetical protein APE1291 



pir:D72603



Acc# 
D72603 



Description 



ORF Name 



NTID AAID 



12357711 13 63 



TTTT 



NT 
n 

7^" 



AA 

— , ■- — ■ Score 
Length Length ■ ■ 

R~3TT~ 



Probability 
|3.4e-41 2 



Protein name 

Description

MOLYBDENUM TRANSPORT ATP-BINDING PROTEIN MODD



Locus Name 



sp:MODD_AZOVI



Acc# 
P37732 



ORF Name 



NTID AAID



.NT AA 
— ■ — , Score 
Length Length ■■■»■■ — 



15710327 c l U4 



TT7TT 



7^" 



1427 



Probability 
5.0e-40



Protein name 



Locus Name 



putative chaperone 



gp:PSNARXL



Acc#
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



NTID AAID 



15781576 . C2 . 103 



11779 - 



3S95 



NT , . AA 

Length Length 

223 ■ 



Score Probability



Protein name 



Description 



Locus Name 



sp:YADlM^OLl 



Acc# 

P36857:P75 
656 



HYPOTHETICAL 25.1 KD PROTEIN IN HPT-PAND INTERGENIC REGION




ORF Name 



NTID AAID


NT AA
Length Length 


Score 


Probability


19735188_13^58 


1780 3700


677 2034


484


|3.«e-70 ; : 



Protein name 



Locus Name



nitrate/nitrite sensory protein



gp:PSNARXL



Acc# 
Y15252



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.






ORF Name


NTID AAID NT Length




AA 

, — . , Score 
Length • 


Probability 


19806552_t2_31 , 


1781 3701 187




564 134


5.5e-08


Protein name 






Locus Name




Acc# 


Notch homolog


gp:AF033013


AF033013


Description




Bombyx mori Notch homolog mRNA, partial cds.




ORF Name 


NTID AAID NT Length


AA
Score
Length 


Probability 


19806552_t3_51


|1732 3702 ' 180 




543 142


1.3e 




Protein name 






Locus Name 




Acc# 


Notch homolog 


gp:AF033013 




AF033013 


Description 












Bombyx mori Notch homolog mRNA, partial cds.




ORF Name


NTID AAID NT Length


AA 

f ^— ■ Score 
Length . 


Probability 


2Q423500_t2_33 


|1783 3703 259 




810 420


2.7e-39


Protein name 

*■" . ■ 






Locus Name 




Acc# 




• • 




sp:MOEB_SALTY




Q56067 


Description 












MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN




ORF Name 


NTID AAID NT Length


, AA ■ • . ' 
T — ^ . Score •.■ 
, • Length , 


Probability 


205S7686_li_8< 


1784 . 3704 139 . k 




420 : ; 






Protein name






Locus Name 




Acc# 


Description 












NO-HIT




ORF Name 


NTID AAID NT Length


' ' ! AA 

, -t— . , Score , 
- Length 


Probability 


2087638V tl 13"', , / 


1785 3705 234




705 247


5.9e-21


Protein name 






Locus Name




Acc# 








sp:YIIM_ECOLI




P32157



Description

HYPOTHETICAL 26.6 KD PROTEIN IN KDGT-CPXA INTERGENIC REGION (O234)






ORF Name 


NTID AAID 


NT 

Length 


AA 

, — Score 
Length 


Probability 


21485962_c2_129 ' 


1786 3706


63


189 




Protein name








Locus Name 


Acc#


Description 












NO-HIT


ORF Name 


NTID AAID


NT

Length 


AA

T — L1 Score 
Length 


Probability 


|21673452_c3_i41 


1787 3707 


448


1347 1455


5.8e-149 


Protein name 








Locus Name 


Acc# 


nitrate extrusion protein 


gp:PSNARXL


Y15252 


Description 










Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.












ORF Name 


NTID AAID 


NT 

Length


AA

— , Score 
Length: 


Probability


2i(588888_ci_76 


.1788 3708 " 




999 1035


1.8e-104


Protein name








Locus Name 


Acc# 



sp:THII_SALTY



Description 
THIAMINE BIOSYNTHESIS PROTEIN THII



P55913:O06
955 



ORF Name 


NTID 


AAID 


NT 
Length


AA 

T — . •• Score 
Length 


Probability


220007i7_ii_19 


1789


3709 


145


438 




Protein name 








Locus Name 


Acc#


Description 












NO-HIT 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— ' ' Score 
Length 


Probability 


22378418_tl_17 


1790 


3710


283 


852 502


b.6e-48 - - 



Protein name 



Locus Name 



sp:MODA_HAEIN 



Acc#
P45323 



Description
MOLYBDATE-BINDING PERIPLASMIC PROTEIN PRECURSOR






ORF Name 




NT ID AAID 


NT 
Length 


AA 
Length


Score 


Probability 


2 2bb.4 0 31_£3_ 




1791 5711 


194 


58b ■ 


499 


1.2e-47


Protein name 








Locus Name


Acc# 












sp:MOAB_ECOLI 


P30746 


Description 
















MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN B










ORF Name 




NT ID AAID 


NT
Length 


AA 
Length 


Score 


Probability 


24068812_il_ 




1792 3712


258


777


561


1.1e-54



Protein name 



nitrate/nitrite regulatory protein 



Locus Name 
gp:PSNARXL



Acc# 
Y15252



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

Score
Length 


Probability 


24423375_11_5


1793 


3713


71 


216




Protein name 








Locus Name


Acc# 


Description 












NO-HIT 












ORF Name


NTID


AAID


NT
Length 


AA
Score
Length


Probability 


24423375_12 _32 


1794


3714 


65 


201




Protein name 








Locus Name


Acc# 


Description 












NO-HIT


ORF Name 


NTID 


AAID 


NT
Length 


AA

— , . Score 
Length 


Probability 


p46SiB36_tl_16 


1795


3715 


200


603 310


1.2e-27


Protein name 








Locus Name 


Acc#



Description 



sp:Y903_SYNY3



Q55371 



HYPOTHETICAL "IS. -5 PROTEIN SLR0903 



ORF Name 



NTID 



AAID 



Protein name 



MHC class I antigen 



Description 



ORF Name 



NTID AAID 



2 7 528 j tl lb 



T7TT 



TTTT 



Protein name



hypothetical protein Rv2453c



Description 



NT AA
Score
Length Length



TUT 



1017, | |82 | r 



Locus Name 



pir:I57454



NT 



AA



Locus Name 



pir:D70864



Probability 
0.048



Acc# 
I57454



Score Probability 



Length Length 

pi ,| |14!> | f77¥i^TCr 



Acc# 
D70864 



ORF Name 



2853417 tl 11 



Protein name



Description 



NO-HIT 



NTID AAID



T7W 



T7TS" 



NT AA 
Length Length
66



Score Probability 



!T0T~ 



Locus Name



Acc# 



"ORF Name 



29432768 c2 123 



Protein name 
Description 



NO-HIT



NTID AAID 



NT 



AA 



T7W 



TTTT 



Length Length 
91



Score Probability 



Locus Name



Acc#



ORF Name 



30509827 tl 7 



Protein name 
Description 
NO-HIT



NTID AAID



NT 



AA 



TSW 



T727T 



Length Length
134 



Score Probability



I! 



3"5~ 



Locus Name



Acc# 



ORF Name 



NTID AAID



NT 



AA 



Length Length 



Score Probability



|31453152_c2_12U . |1801 


|3721 | |/v 


234 197


1.2e-lb . - | 


Protein name 




Locus Name 


Acc#


hypothetical protein


gp:AF213822


AF213822


Description


Zymomonas mobilis strain ZM4 fosmid clone 42B3, complete sequence.


ORF Name ■ NTID 


NT 

AAID — ■ 
Length 


AA 

— , Score 
Length 


Probability


3333U3*_clJ3 . . 1502 


, 3722. . 521 


1556 2352


4.4e-245



Protein name 



Locus Name 



respiratory nitrate reductase beta-subunit



gp:PSNARXL



Acc# 
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



NTID 



AAID 



3375851b 11 5 



1803



3723



NT AA 
Length Length 
174 •" I 



Score Probability



Protein name
Description
NO-HIT



Locus Name 



Acc# 



ORF Name



NTID



AAID 



NT AA 0 " 

— , — - , Score 
Length Length . ■ ■ ■ 



35351552 ±3 51 



T75" 



IT5~ 



Probability 
3.8e-10



Protein name 



Locus Name 



hypothetical protein ssr1527



pir:S75710 



Description 



Acc# 

S75710:S75
718 



ORF Name



35371012 c2 .102; 



Protein name 
Description 
NO-HIT



NTID AAID NT Length AA Length Score



Locus Name 



Probability 



Acc# 






ORF Name 



NTID AAID



3906bbb t l l 3b 



NT 
n 
T72 



t — V-u Score 
Length Length — — — 



Probability 
3.8e-26



Protein name 



Locus Name 



probable molybdenum-pterin-binding protein



pir:S57954



Acc#
S57954 



Description 



ORF Name 



NT ID AAID 



|4011062_cl_91 



3727 



NT 
n 
¥2T 



AA 

. — - Score 

Length Length — — — 



T2"8T" 



Probability 
7.4e-117



Protein name 



Locus Name 



nitrate extrusion protein 



gp:PSNARXL



Acc# 
Y15252



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



NTID AAID



NT • AA 

— , . ^ , — , Score : 
Length Length ^— — : — 



Probability 



4070308_c3_134 


1808 3728


442


1329 575


2.0e-66


Protein name 






Locus Name


Acc#








sp:MOEA_HAEIN 


P45210


Description










MOLYBDOPTERIN BIOSYNTHESIS MOEA PROTEIN






ORF Name


NT ID AAID 


NT 
Length 


AA
Score
Length


Probability


4344003_t3_b9 


1809 3729


364 


1095 737


7.0e-73



Protein name 



Description



Locus Name 



sp:MOAA_HAEIN 



Acc# 
P45311 



MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN A


ORF Name NTID AAID NT
Length


AA 

_ — . , Score 
Length 


Probability 


4788876_c3_133 1810 3730 248


747 704


|2.2e 




Protein name




Locus Name 




Acc#


respiratory nitrate reductase gamma subunit 


gp:PSNARXL

. .1 ' 


Y15252



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.






ORF Name 



NTID 



AAID 



14797093 FT b2 



11811 



T73T" 



NT AA 
Length Length
1157 ■. 



Score Probability



Protein name 
Description 
NO-HIT



Locus Name 



Acc#



ORF Name 



NTID 



AAID 



4806502 c2 127 



NT AA

Length Length 
102



Score 



JUT 



ITT 



Probability 
3. be -07 . 



Protein name 



Locus Name 



negative regulator of translation



Acc#
AF213822 



Description

Zymomonas mobilis strain ZM4 fosmid clone 42B3, complete sequence.



ORF Name 



NTID 



AAID 



48862b! i2: 35 



NT AA
Length Length 
168 



Score



Probability 
2.7e-32



Protein name



Locus Name 



molybdenum cofactor biosynthesis protein C



gp:AF108766 



Acc#
AF108766 



Description 



Rhodobacter sphaeroides AsmA (asmA) gene, partial cds; YbaU
(ybaU), anthranilate synthase component I (trpE), YibQ (yibQ),
anthranilate synthase component II (trpG),
anthranilate phosphoribosyltransferase (trpD), indole-3-glycerol
phosphate synthase (trpC), molybdenum cofactor biosynthesis protein C



ORF Name


NTID


AAID 


NT
Length 


AA 

, — . , Score 
Length 


» • 

Probability 


|4897576_c3_147: | 


[1314 


3734 


69


210




Protein name








Locus Name 


Acc#


Description












NO-HIT


ORF Name


NTID 


AAID 


NT
Length 


AA 

T - — Ll Score 
Length 


" Probability 


5282812_i3_56' 


1815


1 1735 


55


P « | p | 


0.032


Protein name 








Locus Name


Acc# 


mdpi . ; ' 


gp:AB013441 


AB013441


Description


Mycobacterium bovis gene for MDP1, complete cds.








ORF Name 



NT ID 



AAID 



1530053 12 47 



NT AA
Length Length

— i 11158 



Score Probability 
320 1.1e-28



Protein name 



Locus Name 



ORF396 protein 



Acc#
Z73914 



Description

Pseudomonas stutzeri orf175 gene.



ORF Name


NT ID 


AAID 


NT 
Length 


AA' . ■ 

T — . i Score 
Length 


Probability 


63S903_11_10' . 


1817


3737


1 75 


228 






Protein name 








Locus Name 


" Acc# 


Description














NO-HIT


ORF Name 


NTID 


AAID 


NT
Length 


AA 

, — , ,< " Score 
Length, 


Probability 


7064692_ci_86 


1818 


3738 


355 


1110 


415 


9.3e-39 



Protein name 



Locus Name



NifM protein



gp:PSNARXL



Acc#
Y15252 



Description 



Pseudomonas aeruginosa narX, narL, narK1, narK2, narG, narH, narJ, narI,
nifM, moaA genes.



ORF Name 



NTID 



AAID 



7225637 13 50 



TZTT 



T7T9" 



NT 
ng 
T71U 



AA 

t Score 
Length Length . — 



Probability 
540 6.2e-50



Protein name 



Locus Name 



filamentous hemagglutinin-like protein
PspA: probable secreted protein



pir:T09083



Acc#
T09083



Description 

ORF Name 
9814751_c3_131
Protein name 



NTID 



AAID 



1820 



3740 



NT AA
Length Length 
1271 3816



Score 



15075 



Locus Name 



alpha-subunit of nitrate reductase



gp:PFU71398



Probability 
0.0

Acc#
U71398



Description 



Pseudomonas fluorescens nitrate reductase alpha-subunit (narG)
and beta-subunit (narH) genes, partial cds.






ORF Name 



NTID AAID 



11 7 fcbV6- El 24 



T7¥T" 



NT AA 
Length Length 
157 



Score Probability 

vhtt 



4.7e-28



Protein name 



Description 



Locus Name 



Acc# 



sp:YAII_ECOLI



P52088:P75
703 



HYPOTHETICAL 17.0 KD PROTEIN IN PROC-AROL INTERGENIC REGION




ORF Name


NT AA 

NTID" AAID T - . 

Length Length 


Score 


Probability ■, 


14647507J:2_20 


1822 3742 405


1218 


386 


1.1e-35



Protein name 



Locus Name 



conserved hypothetical protein aq_740



pir:A70365 



Acc# 
A70365



Description 
ORF Name 



NTID AAID 



NT AA

Score Probability
Length Length



123443752 El 23 



-WIT 



TTTT 



1 



3.4e-23



Protein name 



Locus Name



sp:YTRP_PSEPU 



Acc# 
P40604 



Description
HYPOTHETICAL 62.7 KD PROTEIN IN TRPE-TRPG INTERGENIC REGION PRECURSOR



ORF Name



NTID AAID



NT. AA 
■*-r , — Score 

Length Length — - — — 



123510536 c3 58 • 



T7TT" 



5T5~ 



Probability 
S.7e-75 — - 



Protein name 



Locus Name 



sp/:-YQ01i_HAfc!lN- 



Acc#

P44197 



Description 








HYPOTHETICAL PROTEIN HI1435


ORF Name NTID AAID


NT
Length 


AA
Score
Length


Probability 


24544035_c2_45 1825 3745


219


650 255


8.4e-22



Protein name 



Locus Name 



probable citrate lyase beta chain 



pir:T35062



Acc# 
T35062 



Description 






ORF Name 



NTID AAID



NT AA 

t ~ t ~~ \ , Score 
Length Length . . 



1250251 12 18 



[T7T5~ 



T7T 



Protein name 



Locus Name 



sp:PUR6_HAEIN



Probability 
p.0e-50 

Acc# 
P43849



Description

(EC 4.1.1.21) (AIR CARBOXYLASE) (AIRC)



ORF Name 


NTID


AAID


NT 
Length 


AA 

Score
Length 


Probability 


254flS763^ti_i : 


1827 


3747




270




Protein name 










Locus Name 


Acc# 


Description 






t- ,-. ■ 








NO-HIT


ORF Name 


NTID 


AAID


NT 
Length


AA 

, — . , Score 
Length 


Probability 


|2S510974_cl_34 


1828


3748


185


558 331


7.4e-30


Protein name 










Locus Name 


Acc# 












sp:YBEQ_ECOLI


P77234 


Description 














HYPOTHETICAL 37.3 KD PROTEIN IN LEUS-GLTL INTERGENIC REGION




ORF Name 


NTID


AAID 


NT
Length 


AA 

T — Score 
Length 


Probability


29304668_i3_30 


1829


3749 


302


909 556


I.le-b3- 


Protein name 










Locus Name


Acc#












sp:SYK3_ECOLI


Description 












141 , 


HYPOTHETICAL LYSYL-TRNA SYNTHETASE HOMOLOG (GX)




ORF Name


NTID 


AAID 


NT
Length 


AA
Score
Length 


Probability


3127.530i^tl_il- 


11830- 


3750 


352


1059 516


1.8e-49



Protein name 

Description

ERYTHRONATE-4-PHOSPHATE DEHYDROGENASE



Locus Name 



sp:PDXB_ECOLI



Acc#
P05459






ORF Name 



NT ID AAID 



34179211 ti- 24 



NT 
Length 

330 . . 



AA 
Length 

1993 



Score Probability
1.2e-08



Protein name



Locus Name 



probable protein serine-threonine
phosphatase



pir:C75297



Acc#
C75297 



Description 
>ORF Name 



NTID AAID



NT
Length



|363,6b6i>b_12_17 



11832 



13752 



AA 
■Length 
1444 



Score Probability 
1.2e-31



Protein name 



Locus Name 



hypothetical protein jhp1377



pir:D71815 



Acc# 
D71815 



Description 



ORF Name 


NTID 


AAID


. NT = '< 
Length 


AA 

, , Score 
Length 


Probability 


5li8952_cl_35. 


1833


3753 


403 


1212 703


2.8e-69


Protein name 

t. . . .■ / 








Locus Name


Acc# 










sp:PYR2_PSEAE


Q51551


Description










■ • ' j j 


CATALYTIC CHAIN)


ORF Name


' NTID • 


AAID 


NT • 
Length 


AA 

— , Score 7 
Length 


Probability - , ■:: 


5275300_.t3i22, : , 


-. 1834. 


3754 


347 , . 


; .|1044 954- • 


' 7 . le- 96 


Protein name 








Locus Name 


" Acc# ' 










sp:BIOB_ECOLI


P12996


Description 












BIOTIN SYNTHASE, (BIOTIN SYNTHETASE)








ORF Name . 


• NTID 


AAID" 


, NT. 
Length 


AA 

' . — , , Score 
Length •.. • 


Probability , 


5948342_r3_27 


| |1835 


] 3755 


259 .. 


7 780 530 


6,0e-5i . 


Protein name. 








Locus Name 


. Acc# 



sp:PURK_PSEAE



P72158 



Description 
(AIR CARBOXYLASE) (AIRC)






ORF Name 


NTID 


AAID 


NT 
Length


AA 
Length ." 


Score 


Probability 


I7041_il_8 


1 1836 


| 3756 




399 


.1155 


|l,5e 


-.10 


Protein name 








Locus Name. 




Acc# 










sp:PURK_AQUAE




O66608


Description 
















(AIR CARBOXYLASE)


(AIRC) 














ORF Name , 


NTID 


AAID 


NT
Length


AA • 
Length 


Score 


Probability 


129.91392J:2_18 


1837 


1757 


101 V; 


F 


1160 1 


|9.7e 


-12 


Protein name 








Locus Name 




Acc#. 


unknown 








|gp:PDU08856 




U08856 


Description 
















Paracoccus denitrificans insertion sequence IS1248b, complete sequence.


ORF Name 


NTID 


AAID 


NT
Length


AA: 
Length 


Score 


Probability 


15110912_c3J>9 


.1828 


. 3758 


358- :• 


. 1077 • 


po I 


|6.7e 


-68 


Protein name 








Locus Name 




. Acc#. 










sp:YQJM_BACSU




P54550


Description 
















probable NADH-dependent flavin oxidoreductase yqjM,


ORF Name


NTID 


AAID,. ■ 


NT 

Length 


AA 
Length' 


Score 


Probability 


1563278i_c2_j>i. 


1839 


/ 3759. 


<< 79 










Protein name 








Locus Name , 




•Acc# 


Description 
















NO-HIT


ORF Name' . 


NTID 


AAID 


' NT . 
, Length 


AA 
Length. 


Score 


Probability 


1W73U16J:3__31 


1840 


3760 


1 ^ .1 


. 1419 


|1772 | 


|1.5e 


-182 



Protein name 



Locus Name 



type I site-specific deoxyribonuclease, Hsd
chain R: type I restriction enzyme, Hsd, chain
R: type I restriction-modification system,



pir: JC5216 



Acc# 
JC5216 



Description . 






ORF Name 


NT ID 


AAID 




V* J. 

Length 


AA. l - „ 
, — . , . • Score 
Length — - 


Probability . 


19823578_t3_2y 


.1841 


3761 




405 




1218- 130 


l.Oe 


-07 


Protein name 














Locus Name 




ACC# 


hypothetical protein












pir:A75592




A75592 


Description 




















ORF Name 


NTID 


AAID 




NT 
Length 


AA . . • 
— . , Score 
Length 


Probability 


ai6420W_ci^70 


: 1842 


| 3762 




232 : 




699 686.:. 


|18e 


-67 


Protein name 














.Locus Name 




• Acc# 
















sp:YC78_HAEIN




Q57431:O05050


Description 


















PUTATIVE NAD(P)H NITROREDUCTASE,


ORF Name . 


. NTID 


AAID 




NT 

Length 


AA ' 
T — . , Score. 
Length 


Probability ' 


2I673201_t3_25 


1843 


3763 




220 --. 




663. . ' 364 


2 .4e 

•t 




Protein name , ; 














Locus Name 




Acc# '.. 


protein Tp70 












pir:A71309






Description 


















A71309:S18231:S19826


' ORF Name 


NTID,, 


. AAID 




NT 
Length 


. aa; 

• ; — Score 
Length 


Probability •, 


2i89075_c2_66 


1844 


3764 




126 




381 . 195 


1.9e 




Protein name , 














Locus Name 




Acc# 
















sp:YPRO_OWEFU




P21260:P21261


Description < 


















HYPOTHETICAL PROLINE-RICH PROTEIN (FRAGMENT)










ORF Name . 


NTID 


AAID 




NT, 
Length 


AA 

— , k- Score 
Length 


Probability 


22143752_ti_9 


|1845 


3765 




636 




1911 P756J 


|7.9e 


-287 


Protein name' 














Locus Name 




Acc# 


type I site-specific deoxyribonuclease, Hsd






pir:JC5216




JC5216


chain R: type I restriction enzyme, Hsd, chain
R: type I restriction-modification system,













Description 






ORF Name ' 


NT ID 


AAID 


NT
Length


■ . — , "' Score 
Length - 

i - : . - 


Probability 


23437551_11_4 


1846 


3766 


83 


252 




Protein name 








Locus Name 


Acc# . 


Description 












NO-HIT




ORF Name 


NTID 


AAID 


NT 
Length , 


— Score 
Length ■ 


Probability 


23490937_c3__8i 


1847 


3767 


93 ■ 


282 |73 | 


|0.037 . | 



Protein name 



Locus Name 



nicotinamide adenine dinucleotide 
dehydrogenase 



gp:AF025836



Acc#
AF025836



Description 



Echinostoma sp. I Africa nicotinamide adenine dinucleotide dehydrogenase
subunit 1 (ND1) gene, mitochondrial gene encoding mitochondrial protein,
partial cds.



ORF-. Name - . 


" NTID • . AAID 


NT 
■ . Length 


AA Score 
Length 


Probability 


24704462_c2_51 


1848 3768 


; 206 


621 |296 | 


|3.8e^26 


Protein name 






' ; Locus Name 


Acc#- ' 


cinnamyl-alcohol


dehydrogenase




gp:AF083333


AF083333


Description 










Medicago sativa cinnamyl-alcohol dehydrogenase (MsaCad1) mRNA, complete cds.


ORF Name 


NTID AAID" 


NT ' 
Length 


AA 

: ■ — , Score 
Length . 


. Probability 


32667715_r3J26 . 


1849 3769 


|I88 


567 |106 | 


- |0 ,00093 -;• 



Protein name 



' .Locus Name 



hypothetical protein TP0570



pir:H71308



Acc# 
H71308 



Description 



ORF .Name 



NTID 



AAID 



35350802 "FI -5 



TT7TT 



NT 
3n 
HZ7 



AA , 

■ — : '-. Score 
Length Length — 



"5TJT" 



Probability 
2.4e-31. 



Protein name 



Locus Name 



putative transposase 



gp:AF007429



Acc#
AF007429



Description 

Haemophilus paragallinarum IS-like putative transposase gene, complete cds.






ORF Name 



NTID 



AAID. 



T77T" 



NT ' AA 

Length Length 
450 I 11383 



Score 



Probability 
1.7e-55 — 



Protein name 



Locus Name 



type I site-specific deoxyribonuclease, Hsd
chain S: type I restriction enzyme, Hsd, chain
S: type I restriction-modification system,



pir: JC5218 



Acc# 
JC5218 



Description 
ORF Name 



NTID 



.AAID 



4032715 E3 27 



T5"5T" 



TTTF 



NT AA : 

■ — , — " Score 
Length Length 



Probability 
i.Ue-17 



Protein name 



Description ' 



Locus ■ Name , 



sp:Y4SN_RHISN



Acc#
P50358



HYPOTHETICAL 14.4 


KD PROTEIN Y4SN








ORF Name 


. NTID AAID 


. NT' 
' Length 


• aa • ' . ..; 

— - , Score , 
Length 


Probability 


41ii333_t3_33 


1853 3773 : 


201« 


603 235 


|5,3e-20 • 


Protein name 






Locus Name . 


Acc# 



Description 



sp:NAHR_PSEPU



P10183 



TRANSCRIPTIONAL ACTIVATOR 


PROTEIN NAHR 








ORF Name NTID 


NT 

AAID ■ — , 
Length 


AA 
Length 


Score 


Probability 


4895127_cl_36 , 1854 


3774 , 116 


351 


334 


3 .6e-30 • 



Protein name 



Locus Name 



OrtS 



gp:AB011413 



Acc# :. 
AB011413 



Description 



Streptomyces griseus genes for Orf2, Orf3, Orf4, Orf5, AfsA, Orf8, partial
and complete cds.


■ NT AA 

1 ORF Name NTID AAID - , — . , . , — , Score Probability ' 

Length Length — ; — — : — — L ~ 


7034808_cl_49 1855 3775 70 . 213 


51 | |0.033 



Protein name 



Locus Name



hypothetical protein ZK856.5



pir:T28044



Acc# . 
T28044 



Description 



ORF Name 



NTID 



AA-ID 



NT AA 
T — ^v, t Score 
Length Length ■ — 



Probability 



/UoUUUl El __. 




_5 / 1 _L_L.lt> zyo 1 4 . 


TZT " i 


Protein name . 




" Locus Name 


Acc# 






sp:VOTG_ECOLI 


P55140. 


Description 








HYPOTHETICAL 34.9 KD PROTEIN IN CYSJ-ENO INTERGENIC REGION (O313)




ORF Name 


NTID AAID 


NT AA ' 
__ — - _ . — - , Score Probability • 
Length Length . - - • 4 


7083578_cl_37\ 


1857 3777 


60 183 162 |2.4e 


- 11 



Protein name 



Locus Name 



NADP-dependent alcohol hydrogenase



gp:LMFLI053 



Acc# 
AL121862 



Description ' ' 

Leishmania major Friedlin chromosome 23 cosmid L1063, complete cds.



ORF Name 



NT ID 



AAID 



9782555 tl 1 



i~5TT5~ 



NT AA 
Length Length ' 
554 I 11655 



Score 



|2b20 



Probability . 
|8:0e-252 



Protein name 



Locus Name 



ALXA and HSDM: 



gp:PHU45781 . 



U46781 



Description 



Pasteurella haemolytica putative coproporphyrinogen III oxidase (hemN') gene,
partial cds, leukotoxin transcriptional activator and restriction modification
methylase subunit (alxA-hsdM), (hsdS) and (hsdR) genes, complete cds.



J ORF. Name 


NTID 


AAID 


NT 
Length 


AA . 

' — . . Score 
Length 


Probability 


14257i5u_ri_2- 


1859 


■ 3779 


294 ■ 


- 885 




Protein name 








Locus Name ■■ 


. Acc# ; .: 


Description . '„ 












NO-HIT


ORF Name 


• NTID 


AAID 


NT 
Length 


AA 

, — . . Score 
Length 


Probability 


16180437_t3_18 


I860 


3780 


502 | 


p09 | |1450 | 


|1. 7e-149 


Protein name ' 








Locus Name 


: -. Acc# . . 



Description 
(SSDH)



sp:GABD_ECOLI



P25526



ORF Name 


NTID 


AAID 


19806552 til 






3781 



■ NT . , AA 
Length ' Length 
174 



^— , .-' Score 



Probability 
|2.6e-07 ~ 



Protein name 



Locus Name 



probable ankyrin



pir:H71274 



Acc# 
H71274 



Description 



ORF Name 



NTID 



AAID 



24407502 t3 17 



TTWF 



NT 
n 



\ AA . 

i t — 4-u Score 
Length Length — — 



Probability 
2,5e^36 . 



Protein name 



Locus Name 



glycine betaine/carnitine/choline ABC
transporter (membrane p) opuCD



pir:F69670



; ACC# 

F69670 



Description 



. ORF Name 



NTID 



AAID 



2590021,2 c2 30 



T7FT 



NT AA ■ 

Length . Length 
TTT~ 1 . 112 5 4 



Score 



TUT 



Probability 
1.3e-05 



Protein name 



Locus Name 



putative natural resistance-associated



gp:CCA133735



Acc#

AJ133735



Description 



Cyprinus carpio mRNA for putative natural resistance-associated macrophage
protein (NRAMP)




ORF Name/ - " . ' : NTID 


AAID 


■ NT AA ~\ 

„ ~~ . , , — . , Score 

Length Length 


Probability 




34094385_ci 23 , . • [1864 


3784 


107 324 155 


6. 3e-ll 


Protein name. 
■A 4- 4- f : — ■ — —r- ! 




: . . , . Locus Name 


Acc# : 





|gp:U59485 



Description 



U59485:L63540



Agrobacterium tumefaciens AtrC (atrC) gene, partial cds; AtrB (atrB), AtrA
(atrA), AttA1 (attA1), AttA2 (attA2), AttB (attB), AttC (attC), AttD (attD),
AttE (attE), and AttF (attF) genes, complete cds; AttG (attG) gene,
alternative splice products, complete cds; AttH (attH), AttI (attI), AttJ
(attJ), AttK (attK), AttL (attL), AttM (attM), AttO (attO), AttP (attP), AttR
(attR), AttS (attS), AttT (attT), AttU (attU), AttV (attV), AttW (attW), AttX



ORF Name 



NTID AAID ' 



NT 



AA 



Length Length 



Score , Probability 



4770887_tlJJ. 1865 | |378b | 176 | 


531. po | ,|2.7e-14 


Protein name 




Locus Name Acc# 


hypothetical protein 




gp:SSUifl-330 Y18930 


Description , 






Sulfolobus solfataricus 281 kb genomic DNA fragment, strain P2.



ORF Name 



NTID AAID 



14875260 c2 W 



NT , \- AA 
Length Length 
m — 1 flTT 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 



ORF Name 



4884702 cl- 27 



NTID 



rnnrr 



AAID 



NT AA 
' Length / Length 
1226 I 1681 ' 



' Score, 



[TOT 



Probability 
|1.4e-37 



Protein name 



Locus Name



NonF



gp:AF074603



Acc#
AF074603



Description 



Streptomyces griseus subsp. griseus nonactin biosynthesis gene cluster,
partial sequence.



ORF Name 



NTID 



AAID 



— . ' — ' Score . Probability 



6740692 YT 10 



T7W 



NT 
Length 



AA 
Length 



Protein name 



Description 



Locus Name 



Acc# , 



NO-HIT



ORF Name 



NTID AAID 



17218752 t3 15' 



NT ' AA 

Length 1 Length • 

~ 1 HZL 



Score 



Probability 
0.030 — 



Protein name . 



Locus Name 



putative polysaccharide polymerase 



gp:SPCPS14E 



Acc# 
X85787 



Description 



S. pneumoniae cps14 locus.






ORF Name 



NTID AAID \ 



786305 ti b 



1870 



3790 



NT 

TT7 



AA 

— , Score 
Length Length — 



Probability 
9.4e-62 — — 



Protein name 



Locus Name



probable osmoprotection binding protein



pir:G71892 



Acc# . 
G71892 



Description 



■ ORF Name . 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


792090jfc2JL!i - 


. 1871 


3791 


148 


;' 447 . 






.t ■ 


Protein name 








. Locus Name 


Acc# 


Description 


i* 














NO-HIT 


ORF Name 


NTID 


" AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


12273,437^^58 


• 1872 


3792 ..' 


337 


1014 




1680 | 


|8.3e-173 . | 


Protein name 








Locus Name 


Acc#. 










sp:SYGA_MORCA


P77892 


Description 
















ALPHA CHAIN), (GLYRS)














• ORF Name 


NTID 


AAID 


NT 
Length 


AA 
Length 


'r- 

. Score 


Probability" 


I418150uJ:2jj6 


187.3. 


3793 . 


693 . • 


2082 




1593 :. 


1.4e-163 ,- 



Protein name 



Description 



Locus 1 Name 



sp:SYGB_HAEIN



Acc#

P43822



BETA CHAIN) (GLYRS)


ORF Name 


NTID 


AAID 


NT 

Length 


AA . 
Length 


i Score 


Probability 












|196500S2_cl_101 • 


1874 


3794 . 


■ 279 


|840 


581 


. 2.4e-b6 - ; | 



Protein name 



•Locus Name 



sp:BUDC_KLEPN 



Acc# 
Q48436



Description

ACETOIN(DIACETYL) REDUCTASE, (ACETOIN DEHYDROGENASE) (AR)



ORF Name 


NTID 


AAID 


NT 
Length 


AA score 
Length 


Probability 


21G48382_:ti_22 | 


1875 


379S 


1 279 


|840 813 


6.2e 


-81 


Protein name 










Locus Name 




Acc# 












sp:ACCA_ECOLI 


P30867 


Description 
















(EC 6.4.1.2)




ORF Name 


NTID 


AAID 


'NT 
Length 


AA „ 
„ — . », Score 
Length. 


Probability 


|21G50017_c2_112 / 


1876 


3V9fi 


2E>4 J 


|755 | 437 


4 . 3e 


-41 


Protein name 










Locus Name 




. Acc# 












sp:LFTR_ECOLI


P23885 


Description 
















LEUCYL/PHENYLALANYL-TRNA--PROTEIN TRANSFERASE,










ORF Name 


NTID 


AAID ■ ■ 


NT 
Length 


AA „ 
— Score 
Length 


Probability 


21657752 C3_147 


1877 


3797 


34b 


1038; |580 | . 


13. Oe 


-56 


Protein name 










Locus Name 




Acc# 



Description 



sp:Y537_SYNY3



Q55480 



HYPOTHETICAL SUGAR KINASE SLR0537








•;" ORF Name NTID 


, AAID '■ 


NT- 
Length' 


- AA . 
t — . , Score 
Length 


Probability 


2198781i L _i2_:34 1878 


3798 


227 • 1 


714 384 


6l2e-41 



Protein name 



Locus Name 



sp:PGSA_HAEIN



Acc#
P44528



Description
(EC 2.7.8.5) (PHOSPHATIDYLGLYCEROPHOSPHATE SYNTHASE) (PGP SYNTHASE)



ORF Name
22.038132 ±3 57 



NTID AAID 



3799 



NT. AA , 

Length Length . 
75" H 1231 



Score Probability 



Protein name 

Description 

NO-HIT 



Locus Name. 



Acc# . 






ORF Name 


NTID 


AAID 


NT 
Length 


'.AA 

— : , Score 
Length 


Probability 


22384&28_rl_5 


1880 


3800 


44.8 


1347 |1005 | 


|2 . 8e 


-101 


Protein name 










Locus Name . 




Acc# 












sp:YKGC_ECOLI 




P77212 


Description _ 
















INTERGENIC REGION


ORF Name 


NTID 


AAID. , 


NT 
Length 


AA 

. — Score 
Length 


Probability 


p34938I2_i2_33 ■ 


| 1881 


|3801 


998 


• 2997 972 | 


5.2e 


-119' : 


Protein name 










Locus Name 




Acc# 


metalloprotease 1


gp:AF061243 


AF061243 


Description 
















Homo sapiens metalloprotease 1 (MP1) mRNA, complete cds.






ORF Name 


NTID, 


AAID 


NT ; 

, Length 


AA 

-7- _ Score 
Length 


Probability 


23875027_12_50 


J1882 


|3802 


395 


1191 521 , 


1..4e 




Protein name 










Locus Name 




. ACC# \ 



Description 



sp:RLUC_HAEIN



P44433 





SYNTHASE) 


(URACIL HYDROLYASE)






ORF Name • 


•NTID 


NT. AA 
Length Length 


Score 


Probability 


24ii8802_c3_138 


1883; 


3803 441 | 1326 


|I742 | 


|2.2e-179 . 



' Protein name 



Locus Name 



serine hydroxymethyl transferase 



gp:AF073769



Acc# 
AF073769 



Description 



Acinetobacter radioresistens serine hydroxymethyl transferase (glyA) gene,
complete cds.


NT 

ORF Name NTID ' AAID ■ ■ . , — ; 

Length 


AA 

—~ , Score 
Length 


Probability 


242S8777._c3_I34 1884 . 3804 1181 


3546 J1542 


l^e-UO . 



Protein name 



Locus Name 



ribonuclease E: cell shape-determining
protein: message stability-altering
protein: RNase E



pir:S27311



Description 



Acc# 

A64852:S45572:S27311:A23747:JG



ORF Name 



NT ID 



AAID 



24900257 ti 11 



NT AA , 

Length , Length 
35 ~ I &ZT r 



Score 



Probability 
3.8e-10 



Protein name 



Locus Name . 



conserved hypothetical protein 



pir:B72287



. Acc# 
B72287 



Description 



ORF Name 


NT ID AAID 


NT 
Length 


AA 

— Score 
Length;, 


Probability 


25657776_cl_102 


1886 3806 


149 | 


|450 | jlSO . | 


|7.4e^l4 


Protein name. .. . 






Locus Name 


ACC# 








sp:PSPE_ECOLI


P23857


Description 










PHAGE SHOCK PROTEIN E PRECURSOR








ORF Name ; 


NTID , AAID 


NT 
Length 


AA 

" . - . Score 
Length 


Probability 


26565686_c3_14y 


1887 p807 


348, | 


1047, 591 ■ 


5.2e-68 . - ., 



Protein name . 



Locus Name 



hypothetical protein slr0787



pir:S77001



Acc# 
S77001 



Description 



ORF Name 



NTID'- 'AAID 



26754011 ■■■cl 86.- 



3808 



NT AA 
^ Length Length 
357 I [T07T" 



Score Probability 
[1765 | [8 ,le-182 ~ L 



Protein name 



Locus Name ^ 



NAD repressor/NMN transporter NadRp 



gp:MCU73324



Acc# - 
U73324 



Description 



Moraxella catarrhalis glycyl-tRNA synthetase beta subunit (GlyRS) and NAD
repressor/NMN transporter NadRp (NadR) genes, partial cds, and glycyl-tRNA
synthetase alpha subunit (GlyRS) gene, complete cds.



ORF Name 



NTID AAID 



Score Probability 



2845637 c3 137 



Protein name 

Description , ( 
CHORISMATE--PYRUVATE LYASE,



NT : . AA 
Length , Length 

| 519 , | |168 | |i:4e-12 ~ " 

Locus Name . r Acc#. 



TTT 



sp:UBIC_ECOLI



P26602:P76783



ORF Name 



NTID 



AAID 



30332ml bi 



T8~9TT 



NT • AA 

Length . Length 
541 | [T6^ 



Score Probability • 



Protein name 



[1087 | |5.7e-110 

Locus Name • Acc# 



exopolyphosphatase 



|gp:AF053463 



AF053463 



Description 



Pseudomonas aeruginosa thioredoxin (trx) and exopolyphosphatase (ppx) genes,
complete cds.



ORF Name 


NTID 


AAID 


NT ■ 
Length 


AA 
Length 


Score 


Probability 


30726562_t3_62 ' 


| 1891 


pan 


1 768 


2307 


2256 


7.6e-234 



Protein name 



Locus, Name 



hypothetical protein 



gp : PFFC2 



Acc# ■/ 
Y11998- 



Description .■ t . , 

P. fluorescens FC2.1, FC2.2, FC2.3c, FC2.4 and FC2.5c open reading frames.



ORF Name 



NTID 



AAID 



33240686 c3v!40 



.NT AA 
Length Length 
260 — I 1783 ' 



Score 



TTT5" 



Probability 
|1.4e-09 ~ 



Protein name 



Description 



Locus Name 



sp:PNUC_ECOLI 



Acc# 

P31215:P77227



PNUC PROTEIN 



ORF Name 


NTID 


" AAID 


NT ' 
Length 


AA 

■ — Score 
Length 


Probability ' 


34421878_c2_120 


1853 y 


3813 


408 ■ , | 


1227 991 


|8.5e-100 


Protein name




s 




Locus Name 


Acc# 










sp:YHIN_ECOLI


P37631:P76705


Description


HYPOTHETICAL 43.8 KD PROTEIN IN RHSB-PIT INTERGENIC REGION




ORF Name 


NTID 


AAID 


NT 
Length 


AA ' 
Length .. . 


Probability 


34578126_13_b4 


1894 


3814 


355 


|1068 | |91, | , 


J0.023. . 



Protein name 



Locus Name 



translation elongation factor eEF-1 alpha
chain PIK-A49: phosphatidylinositol 4-kinase
activator PIK-A49



pir:A45325



Description 



Acc# , 

A45325:B45325:C45325:D45325:E4






ORF, Name 



NTID ' AAID 



4695252 Tl TT 



T5T5" 



NT ' AA 
Length r Length 

TTTT 



Score Probability 
642 | |497 | |i.9,e-47 ~ 



Protein name 



Locus Name 



Acc# 
P46847 



Description •." 
HYPOTHETICAL ^lvO KD PROTEIN IN BIOH-GNTT INTERGENIC REGION (0191)



ORF Name


NT AA

NTID AAID Score

Length Length


Probability


4875885_c2_I26 , 


|1896 3816 164 | 495 .. ■ 120 


1.7e-07


Protein name 


Locus Name 


Acc# 




sp:YFMU_COXBU 


P45680


Description 






HYPOTHETICAL 15.8 


KD PROTEIN IN FMU-RPMH INTERGENIC REGION - 




• ORF Name 


' • ' ■ ' NT AA • 
NTID, ' AAID . 7 — , — : , ' Score 
Length Length 


Probability 


66517i2_ti_i5 


1897- 3817 ; 536 • ' . 1611 . 1545 


1.7e-158 


Protein name t 


Locus Name ■• 


Acc# 


isocitrate lyase 


gp:AB004651 . • 


AB004651 


Description ' it : ' 


Hyphomicrobium methylovorum gene for isocitrate lyase, inorganic phosphate
transporter, methionine synthase, complete and partial cds.


ORF Name 


NT AA ' 
NTID AAID „ — ■■ ■■ — ■ Score 
Length ■ Length 


Probability 


6759625^c3l_lS0 


|1898 . 3818 • , 187 | 564 |248 | 


|4.6e-21 



Protein name 



Locus Name 



hypothetical protein TP0895 



pir:D71266



Acc#
D71266



Description 
; ORF Name 



-NTID 



AAID' 



14642925 t3 23 



T^T9" 



NT . AA.. 

Length Length , 
751 1 11056 . 



Score Probability 



Protein name 
Description 
NO-HIT



Locus Name 



Acc# 






ORF Name 



NTID 



„ AAID 



166937b0 E3 17 



• NT. 
Length 
10 0 



AA 

T — 4-i, Score 
Length — - — : — - 

[IT7- 



Protein name 



■Locus Name 



conserved hypothetical protein yerL



pir:A69795



Probability 
|2.7e-09. — 

' Acc# 
A69795 



Description 
ORF Name 



NTID 



AAID 



183-437- £3 18 



[1901 | 



3821 



NT 
Length 
1496 



Length 
11491 



Score Probability 
12443 | 



1.2e-253 



Protein name 

Description 

PUTATIVE AMIDASE,



Locus Name 



sp:AMID_MORCA



Acc# 
Q49091 



ORF Name 



NTID 



AAID 



1987793 12 15 



11902 



. NT 
Length 

254",; 



AA ; 

T ~~~ ^ Score 
Length 



[7W 



Probability 
189 | |8.1>e-15 ~ - 



Protein name 



Description . 



Locus Name, 



sp:MINC_ECOLI 



Acc# 
P18196 



CELL DIVISION INHIBITOR MINC 



ORF Name 



NTID 



AAID 



22078181. c3 52 



fTTOT" 



NT 
Length , 
137. 



AA 

.' — , Score 
Length — -, 



Probability 



[5TT" 



1151 I |7.6e-i2 



Protein name 



Locus Name 



mat -type, protein 



, pir:D72129 



Acc# 
D72129 



Description ; 



ORF Name 



NTID 



AAID 



, NT 
Length 



AA 



Length 



Score'' 



22942053 11 9 



J1904 




3824 . 




70 - 




210 



Protein name 
Description 
NO-HIT



Locus Name 



Probability 



Acc# 






ORF Name 



NTID 



AAID 



23492792 c2 37 



TS2TT 



NT AA 
Length Length 
203 ' 



Score 



[ST2~ 



Probability , 
1.0e-09 — — 



Protein name 



Description 



Locus Name 



sp:CYC5_AZOVI



Acc# 
P11732 



CYTOCHROME 


C5 














ORF Name 




NTID 


AAID 


NT 
■Length 


AA 
Length 


Score 


Probability 


24259,651_t2; 


14 


, |I906 


|3826 


■319 | 


• : |960 


462 


9.7e-44 , 



Protein name 



Description 



Locus Name 



sp:YIHG_ECOLI



Acc# • 
P32129 



HYPOTHETICAL 36.3 KD PROTEIN IN DSBA-POLA INTERGENIC REGION


ORF Name 


NT AA
NTID AAID Score Probability
Length Length


2435155?_cl_28. 


1907 3827 ' 231 696 : 


• 389 , ■ 5 . 3e-36 



Protein name



Locus Name 



outer membrane protein homolog 



gp:AF067083



Acc#
AF067083



Description 



Vitreoscilla sp. outer membrane protein homolog gene, complete cds; Trp
repressor binding protein gene, partial cds; and unknown genes.



ORF Name 


NTID 


AAID 


NT 
Length 


AA . 
Length 


' Score 


Probability, 


3p203430_c2_35 


|1908 


3828 


86 - 


: 261 






Protein name 








Locus Name 


' ' Acc# 


Description 














NO-HIT


ORF Name ■ 


NTID 


AAID 


NT 
Length 


AA 
Length 


Score 


Probability 


3142 32.00_t3_25 « 


|1909 


3829 . 


181 


546 


622 . 


1.1e-60



Protein name 



Locus Name 



cell division inhibitor minD: septum
site-determining protein minD



|pir:CCECID 



Description . 



Acc# 

B31877:D64863



ORF Name 


, NTID 


AAID 


NT 
Length 


AA 

r — . , Score 
Length 


Probability 


4729837_±3_19 


1910 


• 3830 


piv | 


954 1525 


4 . 4e- 


-167 | 


Protein name 










Locus Name 




Acc# 


BRO-1 ' 




gp:MCBLABRO1


Z54180 


Description 














M. catarrhalis 


bla gene . 
















ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— Score ■ 
Length 


Probability 


47910b3_cl_31 


1911 


3831 : 


p | 


219 |b4 | 


10.0053 1 


Protein name 






Locus Name 




Acc# : . 


gag protein 




gp:MUSERVGG2 


M26006 



Description 



Mouse endogenous retrovirus truncated gag gene, complete cds, clone del env-2 15.3


ORF Name 


NT AA' 

NTID AAID , ■ — Score Probability 
Length Length •• 


975677_t3_20 


1912 |3832 494 . 


1485 . |2124 . 7:4e-220 



Protein name



Description 



Locus Name 



sp:YBL3_MORCA



Acc# 
Q49092 



HYPOTHETICAL 45.4


KD PROTEIN 


■ IN BLOR 


-1 (ORF3)






. ORF Name 


■ -NTID AAID 


NT 
Length 


K " AA .-■ ■ 

— • Score 
Length 


Probability 


i4459535_12__5 


1913 


3 833 ■ 


715 


2151 1425 


8.7e 


-145 


Protein name. 








Locus Name, , 




. Acc# 










sp:OPDA_HAEIN




P44573


Description 


" ■ \ :\ 












OLIGOPEPTIDASE A,


ORF Name 


NTID AAID 


NT 
Length 


" AA 

• " Score 
Length 


Probability 


19569430_c3_39 1 


1914 


3834 


275 


, '828 454 


6.8e 


-43 ■ 



Protein name 



Description 



Locus Name 



sp:YBHP_ECOLI



Acc#

P75772



HYPOTHETICAL 28.8 KD PROTEIN IN MOAE-RHLE INTERGENIC REGION



ORF Name 


NTID 


AAID 


NT 
Length 


AA 

_ — . , '■ Score 
Length . 


Probability 


21718878_cl_20 " 


, 1915 




269 | 


810. 




Protein name 








Locus Name 


' Acc# 


Description 












NO-HIT • 


ORF Name 


NTID 


AAID 


NT 
Length 


AA 

— , Score 
Length 


Probability 


22847175 J:3_15 1 


1916 


3835 


83 


252 181 J 


|0. 0023 



Protein name 



Locus Name 



sp:YHEV_ECOLI



Acc# 
P56622 



Description 

HYPOTHETICAL 7.6 KD PROTEIN IN SLYD-KEFB INTERGENIC REGION



ORF Name 



NTID 



AAID 



23445300 c3 37 



TTTT 



NT • AA 
Length Length 

9 23 r I \rm 



Score Probability 



Protein name 



|778 | |2.2e-100 ~ 

Locus Name . ■ , Acc#, 



prolyl oligopeptidase, precursor



pir:A38086



A38086



Description 

ORF Name
3907568_c2_28



NTID 



AAID 



Protein name 



ORF102 



NT , AA 
Length ' Length 
124 



Score Probability
170 I . 10.033.' ^ 



Locus Name 



gp:AF162221



Acc#

AF162221



Description 

Xestia c-nigrum granulovirus genome, complete sequence.



ORF Name 



NTID 



AAID 



4773287 cl 26 



3839 



NT AA 
Length Length 



Score 



] 



Probability 
[4.4e-48 . 



Protein name 



Locus Name 



sp:YGGV_ECOLI



Acc# ■ 
P52061 



Description 

HYPOTHETICAL 21.0 KD PROTEIN IN GSHB-ANSB INTERGENIC REGION (O197)



ORF Name 


NT ,. AA 
NTID AAID • — ;■ — , Score 
Length Length , - 


Probability 


964212_c3_35 


1520 3540. 410 , | 1233 | |104 | 


|0.0091 


Protein name 




Locus Name 


Acc# 


voltage-dependent


anion channel protein 1b


gp:AF178951


AF178951 


Description 


Zea mays voltage-dependent anion channel protein 1b (vdac1b) mRNA, complete
cds; nuclear gene for mitochondrial product.






